Preface



This book is intended to be used as the text for a course in combinatorics at the level 
of beginning upper division students. It has been shaped by two goals: to make 
some fairly deep mathematics accessible to students with a wide range of abilities, 
interests, and motivations and to create a pedagogical tool useful to the broad spec- 
trum of instructors who bring a variety of perspectives and expectations to such a 
course. 

The author's approach to the second goal has been to maximize flexibility. 
Following a basic foundation in Chapters 1 and 2, each instructor is free to pick 
and choose the most appropriate topics from the remaining four chapters. As sum- 
marized in the chart below, Chapters 3-6 are completely independent of each other. 
Flexibility is further enhanced by optional sections and appendices, by weaving 
some topics into the exercise sets of multiple sections, and by identifying various 
points of departure from each of the final four chapters. (The price of this flexibility 
is some redundancy, e.g., several definitions can be found in more than one place.) 

[Chart: Chapter 1 leads to Chapter 2; Chapter 2 leads separately to each of Chapters 3, 4, 5, and 6.]



Turning to the first goal, students using this book are expected to have been 
exposed to, even if they cannot recall them, such notions as equivalence relations, 
partial fractions, the Maclaurin series expansion for $e^x$, elementary row operations, 
determinants, and matrix inverses. A course designed around this book should have 
as specific prerequisites those portions of calculus and linear algebra commonly 
found among the lower division requirements for majors in the mathematical and 
computer sciences. Beyond these general prerequisites, the last two sections of 
Chapter 5 presume the reader to be familiar with the definitions of classical adjoint 
(adjugate) and characteristic roots (eigenvalues) of real matrices, and the first two 
sections of Chapter 6 make use of reduced row-echelon form, bases, dimension, 
rank, nullity, and orthogonality. (All of these topics are reviewed in Appendix A3.) 

Strategies that promote student engagement are a lively writing style, timely and 
appropriate examples, interesting historical anecdotes, a variety of exercises (tem- 
pered and enlivened by suitable hints and answers), and judicious use of footnotes 
and appendices to touch on topics better suited to more advanced students. These 
are things about which there is general agreement, at least in principle. 

There is less agreement about how to focus student energies on attainable objec- 
tives, in part because focusing on some things inevitably means neglecting others. If 
the course is approached as a last chance to expose students to this marvelous sub- 
ject, it probably will be. If approached more invitingly, as a first course in combi- 
natorics, it may be. To give some specific examples, highlighted in this book are 
binomial coefficients, Stirling numbers, Bell numbers, and partition numbers. These 
topics appear and reappear throughout the text. Beyond reinforcement in the service 
of retention, the tactic of overarching themes helps foster an image of combinato- 
rics as a unified mathematical discipline. While other celebrated examples, e.g., 
Bernoulli numbers, Catalan numbers, and Fibonacci numbers, are generously repre- 
sented, they appear almost entirely in the exercises. For the sake of argument, let us 
stipulate that these roles could just as well have been reversed. The issue is that 
beginning upper division students cannot be expected to absorb, much less appreci- 
ate, all of these special arrays and sequences in a single semester. On the other 
hand, the flexibility is there for willing admirers to rescue one or more of these 
justly famous combinatorial sequences from the relative obscurity of the exercises. 

While the overall framework of the first edition has been retained, everything 
else has been revised, corrected, smoothed, or polished. The focus of many sections 
has been clarified, e.g., by eliminating peripheral topics or moving them to the exer- 
cises. Material new to the second edition includes an optional section on algo- 
rithms, several new examples, and many new exercises, some designed to guide 
students to discover and prove nontrivial results for themselves. Finally, the section 
of hints and answers has been expanded by an order of magnitude. 

The material in Chapter 3, Pólya's theory of enumeration, is typically found closer 
to the end of comparable books, perhaps reflecting the notion that it is the last 
thing that should be taught in a junior-level course. The author has aspired, not only 
to make this theory accessible to students taking a first upper division mathematics 
course, but to make it possible for the subject to be addressed right after Chapter 2. 
Its placement in the middle of the book is intended to signal that it can be fitted in 
there, not that it must be. If it seems desirable to cover some but not all of Chapter 3, 
there are many natural places to exit in favor of something else, e.g., after the appli- 
cation of Bell numbers to transitivity in Section 3.3, after enumerating the overall 
number of color patterns in Section 3.5, after stating Pólya's theorem in Section 3.6, 
or after proving the theorem at the end of Section 3.6. 

Optional Sections 1.3 and 1.10 can be omitted with the understanding that exer- 
cises in subsequent sections involving probability or algorithms should be assigned 
with discretion. With the same caveat, Section 1.4 can be omitted by those not 
intending to go on to Sections 6.1, 6.2, or 6.4. The material in Section 6.3, touching 
on mutually orthogonal Latin squares and their connection to finite projective 
planes, can be covered independently of Sections 1.4, 6.1, and 6.2. 

The book contains much more material than can be covered in a single semester. 
Among the possible syllabi for a one semester course are the following: 

• Chapters 1, 2, and 4 and Sections 3.1-3.3 

• Chapters 1 (omitting Sections 1.3, 1.4, & 1.10), 2, and 3, and Sections 5.1 
& 5.2 

• Chapters 1 (omitting Sections 1.3 & 1.10), 2, and 6 and Sections 4.1-4.4 

• Chapters 1 (omitting Sections 1.4 & 1.10) and 2 and Sections 3.1-3.3, 
4.1-4.3, & 6.3 

• Chapters 1 (omitting Sections 1.3 & 1.4) and 2 and Sections 4.1-4.3, 5.1, & 
5.3-5.7 

• Chapters 1 (omitting Sections 1.3, 1.4, & 1.10) and 2 and Sections 4.1-4.3, 
5.1, 5.3-5.5, & 6.3 

Many people have contributed observations, suggestions, corrections, and con- 
structive criticisms at various stages of this project. Among those deserving special 
mention are former students David Abad, Darryl Allen, Steve Baldzikowski, Dale 
Baxley, Stanley Cheuk, Maria Dresch, Dane Franchi, Philip Horowitz, Rhian 
Merris, Todd Mullanix, Cedide Okay, Glenn Orr, Hitesh Patel, Margaret Slack, 
Rob Smedfjeld, and Masahiro Yamaguchi; sometime collaborators Bob Grone, 
Tom Roby, and Bill Watkins; correspondents Mark Hunacek and Gerhard Ringel; 
reviewers Rob Beezer, John Emert, Myron Hood, Herbert Kasube, Andre Kezdy, 
Charles Landraitis, John Lawlor, and Wiley editors Heather Bergman, Christine 
Punzo, and Steve Quigley. I am especially grateful for the tireless assistance of 
Cynthia Johnson and Ken Rebman. 

Despite everyone's best intentions, no book seems complete without some errors. 
An up-to-date errata, accessible from the Internet, will be maintained at URL 

http://www.sci.csuhayward.edu/~rmerris 

Appropriate acknowledgment will be extended to the first person who communi- 
cates the specifics of a previously unlisted error to the author, preferably by 
e-mail addressed to 

merris@csuhayward.edu 
The Mathematics of Choice 



It seems that mathematical ideas are arranged somehow in strata, the ideas in each 
stratum being linked by a complex of relations both among themselves and with those 
above and below. The lower the stratum, the deeper (and in general the more difficult) 
the idea. Thus, the idea of an irrational is deeper than the idea of an integer. 

— G. H. Hardy (A Mathematician's Apology) 

Roughly speaking, the first chapter of this book is the top stratum, the surface layer 
of combinatorics. Even so, it is far from superficial. While the first main result, the 
so-called fundamental counting principle, is nearly self-evident, it has enormous 
implications throughout combinatorial enumeration. In the version presented here, 
one is faced with a sequence of decisions, each of which involves some number of 
choices. It is from situations like this that the chapter derives its name. 

To the uninitiated, mathematics may appear to be "just so many numbers and 
formulas." In fact, the numbers and formulas should be regarded as shorthand 
notes, summarizing ideas. Some ideas from the first section are summarized by 
an algebraic formula for multinomial coefficients. Special cases of these numbers 
are addressed from a combinatorial perspective in Section 1.2. 

Section 1.3 is an optional discussion of probability theory which can be omitted 
if probabilistic exercises in subsequent sections are approached with caution. 
Section 1.4 is an optional excursion into the theory of binary codes which can be 
omitted by those not planning to visit Chapter 6. Sections 1.3 and 1.4 are partly 
motivational, illustrating that even the most basic combinatorial ideas have real- 
life applications. 

In Section 1.5, ideas behind the formulas for sums of powers of positive integers 
motivate the study of relations among binomial coefficients. Choice is again the 
topic in Section 1.6, this time with or without replacement, where order does or 
doesn't matter. 

To better organize and understand the multinomial theorem from Section 1.7, 
one is led to symmetric polynomials and, in Section 1.8, to partitions of n. 
Elementary symmetric functions and their association with power sums lie at the 
heart of Section 1.9. The final section of the chapter is an optional introduction to 
algorithms, the flavor of which can be sampled by venturing only as far as 
Algorithm 1.10.3. Those desiring not less but more attention to algorithms can 
find it in Appendix A2. 

1.1. THE FUNDAMENTAL COUNTING PRINCIPLE 




How many different four-letter words, including nonsense words, can be produced 
by rearranging the letters in LUCK? In the absence of a more inspired approach, 
there is always the brute-force strategy: Make a systematic list. 

Once we become convinced that Fig. 1.1.1 accounts for every possible rearran- 
gement and that no "word" is listed twice, the solution is obtained by counting the 
24 words on the list. 

While finding the brute-force strategy was effortless, implementing it required 
some work. Such an approach may be fine for an isolated problem, the like of which 
one does not expect to see again. But, just for the sake of argument, imagine your- 
self in the situation of having to solve a great many thinly disguised variations of 
this same problem. In that case, it would make sense to invest some effort in finding 
a strategy that requires less work to implement. Among the most powerful tools in 
this regard is the following commonsense principle. 

1.1.1 Fundamental Counting Principle. Consider a (finite) sequence of deci- 
sions. Suppose the number of choices for each individual decision is independent 
of decisions made previously in the sequence. Then the number of ways to make the 
whole sequence of decisions is the product of these numbers of choices. 

To state the principle symbolically, suppose $c_i$ is the number of choices for deci- 
sion $i$. If, for $1 \le i < n$, $c_{i+1}$ does not depend on which choices are made in 



LUCK  LUKC  LCUK  LCKU  LKUC  LKCU
ULCK  ULKC  UCLK  UCKL  UKLC  UKCL
CLUK  CLKU  CULK  CUKL  CKLU  CKUL
KLUC  KLCU  KULC  KUCL  KCLU  KCUL



Figure 1.1.1. The rearrangements of LUCK. 
decisions $1, \ldots, i$, then the number of different ways to make the sequence of 
decisions is $c_1 \times c_2 \times \cdots \times c_n$. 

Let's apply this principle to the word problem we just solved. Imagine yourself 
in the midst of making the brute-force list. Writing down one of the words involves 
a sequence of four decisions. Decision 1 is which of the four letters to write first, so 
$c_1 = 4$. (It is no accident that Fig. 1.1.1 consists of four rows!) For each way of 
making decision 1, there are $c_2 = 3$ choices for decision 2, namely which letter 
to write second. Notice that the specific letters comprising these three choices 
depend on how decision 1 was made, but their number does not. That is what is 
meant by the number of choices for decision 2 being independent of how the previous 
decision is made. Of course, $c_3 = 2$, but what about $c_4$? Facing no alternative, 
is it correct to say there is "no choice" for the last decision? If that were literally 
true, then $c_4$ would be zero. In fact, $c_4 = 1$. So, by the fundamental counting 
principle, the number of ways to make the sequence of decisions, i.e., the number 
of words on the final list, is 

$c_1 \times c_2 \times c_3 \times c_4 = 4 \times 3 \times 2 \times 1.$ 

The product $n \times (n-1) \times (n-2) \times \cdots \times 2 \times 1$ is commonly written $n!$ and 
read n-factorial.* The number of four-letter words that can be made up by rearranging 
the letters in the word LUCK is 4! = 24. 
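
The brute-force list can also be delegated to a machine. The following Python sketch is an illustration only, not part of the original discussion; the function name rearrangements is an assumption made for this example. It lists every ordering of the letters in a word and compares the length of the list with 4!.

```python
from itertools import permutations
from math import factorial


def rearrangements(word):
    """Return every ordering of the letters of `word` as a list of strings."""
    return ["".join(p) for p in permutations(word)]


words = rearrangements("LUCK")
print(len(words))                  # 24 words on the brute-force list
print(len(words) == factorial(4))  # True: the list has 4! entries
```

Because the four letters of LUCK are all different, no word is produced twice, so the length of the list is the answer.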

What if the word had been LUCKY? The number of five-letter words that can be 
produced by rearranging the letters of the word LUCKY is 5! = 120. A systematic 
list might consist of five rows each containing 4! = 24 words. 

Suppose the word had been LOOT? How many four-letter words, including non- 
sense words, can be constructed by rearranging the letters in LOOT? Why not apply 
the fundamental counting principle? Once again, imagine yourself in the midst of 
making a brute-force list. Writing down one of the words involves a sequence of 
four decisions. Decision 1 is which of the three letters L, O, or T to write first. 
This time, $c_1 = 3$. But what about $c_2$? In this case, the number of choices for decision 
2 depends on how decision 1 was made! If, e.g., L were chosen to be the first 
letter, then there would be two choices for the second letter, namely O or T. If, however, 
O were chosen first, then there would be three choices for the second decision, 
L, (the second) O, or T. Do we take $c_2 = 2$ or $c_2 = 3$? The answer is that the fundamental 
counting principle does not apply to this problem (at least not directly). 
The fundamental counting principle applies only when the number of choices for 
decision i + 1 is independent of how the previous i decisions are made. 

To enumerate all possible rearrangements of the letters in LOOT, begin by distinguishing 
the two O's, say, by writing the word as LOoT. Applying the fundamental 
counting principle, we find that there are 4! = 24 different-looking four-letter words 
that can be made up from L, O, o, and T. 



*The exclamation mark is used, not for emphasis, but because it is a convenient symbol common to most 
keyboards. 
LOoT  LOTo  LoOT  LoTO  LTOo  LToO
OLoT  OLTo  OoLT  OoTL  OTLo  OToL
oLOT  oLTO  oOLT  oOTL  oTLO  oTOL
TLOo  TLoO  TOLo  TOoL  ToLO  ToOL

Figure 1.1.2. Rearrangements of LOoT. 



Among the words in Fig. 1.1.2 are pairs like OLoT and oLOT, which look dif- 
ferent only because the two O's have been distinguished. In fact, every word in the 
list occurs twice, once with "big O" coming before "little o", and once the other 
way around. Evidently, the number of different words (with indistinguishable O's) 
that can be produced from the letters in LOOT is not 4! but 4!/2 = 12. 

What about TOOT? First write it as TOot. Deduce that in any list of all possible 
rearrangements of the letters T, O, o, and t, there would be 4! = 24 different-look- 
ing words. Dividing by 2 makes up for the fact that two of the letters are O's. Divid- 
ing by 2 again makes up for the two T's. The result, 24/(2 x 2) = 6, is the number 
of different words that can be made up by rearranging the letters in TOOT. Here 
they are 

TTOO TOTO TOOT OTTO OTOT OOTT 

All right, what if the word had been LULL? How many words can be produced 
by rearranging the letters in LULL? Is it too early to guess a pattern? Could the 
number we're looking for be 4!/3 = 8? No. It is easy to see that the correct answer 
must be 4. Once the position of the letter U is known, the word is completely deter- 
mined. Every other position is filled with an L. A complete list is ULLL, LULL, 
LLUL, LLLU. 

To find out why 4!/3 is wrong, let's proceed as we did before. Begin by distinguishing 
the three L's, say $L_1$, $L_2$, and $L_3$. There are 4! different-looking words that 
can be made up by rearranging the four letters $L_1$, $L_2$, $L_3$, and U. If we were to make 
a list of these 24 words and then erase all the subscripts, how many times would, 
say, LLLU appear? The answer to this question can be obtained from the funda- 
mental counting principle! There are three decisions: decision 1 has three choices, 
namely which of the three L's to write first. There are two choices for decision 2 
(which of the two remaining L's to write second) and one choice for the third deci- 
sion, which L to put last. Once the subscripts are erased, LLLU would appear 3! 
times on the list. We should divide 4! = 24, not by 3, but by 3! = 6. Indeed, 
4!/3! = 4 is the correct answer. 

Whoops! If the answer corresponding to LULL is 4!/3!, why didn't we get 4!/2! 
for the answer to LOOT? In fact, we did: 2! = 2. 
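
The counts for LOOT, TOOT, and LULL can be checked by brute force. The sketch below is only an illustration under stated assumptions: the helper name distinct_rearrangements is invented for this example, and it works by generating all 4! orderings and then discarding duplicates, which mirrors the "erase the distinguishing marks" argument above.

```python
from itertools import permutations


def distinct_rearrangements(word):
    """Return the sorted list of different words obtainable by rearranging `word`.

    permutations() treats repeated letters as distinguishable, so the set()
    collapses the copies that look alike once the letters are indistinguishable.
    """
    return sorted(set("".join(p) for p in permutations(word)))


for word in ("LOOT", "TOOT", "LULL"):
    print(word, len(distinct_rearrangements(word)))  # 12, 6, and 4 respectively

print(distinct_rearrangements("LULL"))  # ['LLLU', 'LLUL', 'LULL', 'ULLL']
```

This approach is practical only for short words, since the intermediate list has n! entries.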

Are you ready for MISSISSIPPI? It's the same problem! If the letters were all 
different, the answer would be 11!. Dividing 11! by 4! makes up for the fact that 
there are four I's. Dividing the quotient by another 4! compensates for the four S's. 
Dividing that quotient by 2! makes up for the two P's. In fact, no harm is done if 
that quotient is divided by 1! = 1 in honor of the single M. The result is 

$$\frac{11!}{4!\,4!\,2!\,1!} = 34{,}650.$$ 

(Confirm the arithmetic.) The 11 letters in MISSISSIPPI can be (re)arranged in 
34,650 different ways. 

There is a special notation that summarizes the solution to what we might call 
the "MISSISSIPPI problem." 

1.1.2 Definition. The multinomial coefficient 

$$\binom{n}{r_1, r_2, \ldots, r_k} = \frac{n!}{r_1!\, r_2! \cdots r_k!},$$ 

where $r_1 + r_2 + \cdots + r_k = n$. 

So, "multinomial coefficient" is a name for the answer to the question, how 
many n-letter "words" can be assembled using $r_1$ copies of one letter, $r_2$ copies 
of a second (different) letter, $r_3$ copies of a third letter, ..., and $r_k$ copies of a 
kth letter? 



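1.1.3 Example. After cancellation, 

$$\binom{9}{4, 3, 1, 1} = \frac{9 \times 8 \times 7 \times 6 \times 5 \times 4 \times 3 \times 2 \times 1}{4 \times 3 \times 2 \times 1 \times 3 \times 2 \times 1 \times 1 \times 1} = 9 \times 8 \times 7 \times 5 = 2520.$$ 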

Therefore, 2520 different words can be manufactured by rearranging the nine letters 
in the word SASSAFRAS. □ 
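
The multinomial coefficient is easy to evaluate mechanically. The following Python sketch is an illustration, not a definitive implementation; the names multinomial and count_rearrangements are assumptions introduced for this example. It divides n! by each r_i! in turn and uses letter counts to handle the MISSISSIPPI and SASSAFRAS problems.

```python
from collections import Counter
from math import factorial


def multinomial(*parts):
    """Return n! / (r_1! r_2! ... r_k!) where n is the sum of the parts."""
    result = factorial(sum(parts))
    for r in parts:
        # Each intermediate quotient is itself an integer, so // is exact.
        result //= factorial(r)
    return result


def count_rearrangements(word):
    """Count the different words obtainable by rearranging the letters of `word`."""
    return multinomial(*Counter(word).values())


print(multinomial(4, 4, 2, 1))              # 34650
print(count_rearrangements("MISSISSIPPI"))  # 34650
print(count_rearrangements("SASSAFRAS"))    # 2520
```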



In real-life applications, the words need not be assembled from the English 
alphabet. Consider, e.g., POSTNET† barcodes commonly attached to U.S. mail 
by the Postal Service. In this scheme, various numerical delivery codes* are represented 
by "words" whose letters, or bits, come from the alphabet {|, ı}, a long bar and a short 
bar. Corresponding, e.g., to a ZIP+4 code is a 52-bit barcode that begins and ends with 
a long bar. The 50-bit middle part is partitioned into ten 5-bit zones. The first nine of these zones are 
for the digits that comprise the ZIP + 4 code. The last zone accommodates a parity 



This number is roughly equal to the number of members of the Mathematical Association of America 
(MAA), the largest professional organization for mathematicians in the United States. 
†Postal Numeric Encoding Technique. 

*The original five-digit Zoning Improvement Plan (ZIP) code was introduced in 1964; ZIP+4 codes 
followed about 25 years later. The 11 -digit Delivery Point Barcode (DPBC) is a more recent variation. 
[Figure: the 5-bit bar pattern for each of the digits 0 through 9.]

Figure 1.1.3. POSTNET barcodes. 

check digit, chosen so that the sum of all ten digits is a multiple of 10. Finally, each 
digit is represented by one of the 5-bit barcodes in Fig. 1.1.3. Consider, e.g., the ZIP 
+4 code 20090-0973, for the Mathematical Association of America. Because the 
sum of these digits is 30, the parity check digit is 0. The corresponding 52-bit 
word can be found in Fig. 1.1.4. 
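
The parity check rule just described can be sketched in a few lines of Python. This is an illustration only: the function name postnet_check_digit is an assumption made for this example, and the sketch computes the check digit alone, not the bar patterns of Fig. 1.1.3.

```python
def postnet_check_digit(delivery_code):
    """Return the digit that makes the sum of all digits a multiple of 10.

    `delivery_code` is a string such as "20090-0973"; the hyphen is ignored.
    """
    digits = [int(ch) for ch in delivery_code if ch.isdigit()]
    return (-sum(digits)) % 10


print(postnet_check_digit("20090-0973"))  # 0, because the nine digits sum to 30
```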



[Figure: the 52-bit POSTNET barcode for 20090-0973.]

Figure 1.1.4 

We conclude this section with another application of the fundamental counting 
principle. 

1.1.4 Example. Suppose you wanted to determine the number of positive 
integers that exactly divide n = 12. That isn't much of a problem; there are six 
of them, namely, 1, 2, 3, 4, 6, and 12. What about the analogous problem for 
n = 360 or for n = 360,000? Solving even the first of these by brute-force list 
making would be a lot of work. Having already found another strategy whose 
implementation requires a lot less work, let's take advantage of it. 

Consider $360 = 2^3 \times 3^2 \times 5$, for example. If $360 = dq$ for positive integers d 
and q, then, by the uniqueness part of the fundamental theorem of arithmetic, the 
prime factors of d, together with the prime factors of q, are precisely the prime 
factors of 360, multiplicities included. It follows that the prime factorization of d 
must be of the form $d = 2^a \times 3^b \times 5^c$, where $0 \le a \le 3$, $0 \le b \le 2$, and $0 \le c \le 1$. 
Evidently, there are four choices for a (namely 0, 1, 2, or 3), three choices for b, and 
two choices for c. So, the number of possible d's is $4 \times 3 \times 2 = 24$. □ 
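
The argument of Example 1.1.4 translates directly into a short program. The Python sketch below is offered as an illustration under stated assumptions: the function names prime_factorization and count_divisors are invented for this example, and the factorization is found by simple trial division, which is adequate only for modest values of n.

```python
def prime_factorization(n):
    """Return a dict mapping each prime factor of n (n >= 1) to its exponent."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:  # whatever remains is a single prime factor
        factors[n] = factors.get(n, 0) + 1
    return factors


def count_divisors(n):
    """Multiply the numbers of choices (exponent + 1) for each prime of n."""
    count = 1
    for exponent in prime_factorization(n).values():
        count *= exponent + 1
    return count


print(prime_factorization(360))  # {2: 3, 3: 2, 5: 1}
print(count_divisors(360))       # 24
print(count_divisors(12))        # 6
```

Each prime with exponent e contributes e + 1 choices (the exponents 0 through e), and the fundamental counting principle multiplies them.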



1.1. EXERCISES 

1 The Hawaiian alphabet consists of 12 letters, the vowels a, e, i, o, u and the 
consonants h, k, l, m, n, p, w. 

(a) Show that 20,736 different 4-letter "words" could be constructed using the 
12-letter Hawaiian alphabet. 
(b) Show that 456,976 different 4-letter "words" could be produced using the 
26-letter English alphabet.* 

(c) How many four-letter "words" can be assembled using the Hawaiian 
alphabet if the second and last letters are vowels and the other 2 are 
consonants? 

(d) How many four-letter "words" can be produced from the Hawaiian 
alphabet if the second and last letters are vowels but there are no restrictions 
on the other 2 letters? 

2 Show that 

(a) 3! x 5! = 6!. 

(b) 6! x 7! = 10!. 

(c) $(n + 1) \times (n!) = (n + 1)!$. 

(d) $n^2 = n!\,[1/(n-1)! + 1/(n-2)!]$. 

(e) $n^3 = n!\,[1/(n-1)! + 3/(n-2)! + 1/(n-3)!]$. 

3 One brand of electric garage door opener permits the owner to select his or her 
own electronic "combination" by setting six different switches either in the 
"up" or the "down" position. How many different combinations are possible? 

4 One generation back you have two ancestors, your (biological) parents. Two 
generations back you have four ancestors, your grandparents. Estimating $2^{10}$ as 
$10^3$, approximately how many ancestors do you have 

(a) 20 generations back? 

(b) 40 generations back? 

(c) In round numbers, what do you estimate is the total population of the 
planet? 

(d) What's wrong? 

5 Make a list of all the "words" that can be made up by rearranging the letters in 
(a) TO. (b) TOO. (c) TWO. 

6 Evaluate multinomial coefficient 



*Based on these calculations, might it be reasonable to expect Hawaiian words, on average, to be longer 
than their English counterparts? Certainly such a conclusion would be warranted if both languages had the 
same vocabulary and both were equally "efficient" in avoiding long words when short ones are available. 
How efficient is English? Given that the total number of words defined in a typical "unabridged 
dictionary" is at most 350,000, one could, at least in principle, construct a new language with the same 
vocabulary as English but in which every word has four letters — and there would be 100,000 words to 
spare! 
( « (3,2,1)- m(J.2> »vu,u,u, 

7 How many different "words" can be constructed by rearranging the letters in 
(a) ALLELE? (b) BANANA? (c) PAPAYA? 

(d) BUBBLE? (e) ALABAMA? (f) TENNESSEE? 

(g) HALEAKALA? (h) KAMEHAMEHA? (i) MATHEMATICS? 

8 Prove that 

(a) $1 + 2 + 2^2 + 2^3 + \cdots + 2^n = 2^{n+1} - 1$. 

(b) $1 \times 1! + 2 \times 2! + 3 \times 3! + \cdots + n \times n! = (n + 1)! - 1$. 

(c) $(2n)!/2^n$ is an integer. 

9 Show that the barcodes in Fig. 1.1.3 comprise all possible five-letter words 
consisting of two long bars and three short bars. 

10 Explain how the following barcodes fail the POSTNET standard: 
(a) 

(b) 
(c) 



1 I I 1 1 I 1 1 1 I 1 1 I 1 I 1 I 1 1 I 1 I 1 I 1 I I 1 1 1 1 1 1 I I I I 1 1 1 1 1 1 I I I 1 I 1 1 
I 1 1 1 I I 1 1 1 1 I 1 I I 1 1 1 1 1 I I I 1 1 1 1 1 I 1 I 1 I I 1 I I 1 1 1 1 I 1 1 1 1 I 1 I 



1 I I 1 1 I 1 1 1 I 1 1 I 1 I 1 I 1 1 I 1 I 1 I 1 I I 1 1 1 1 1 1 I I I I 1 1 1 1 1 1 I I I 1 I 1 1 1 

11 "Read" the ZIP+4 Code 
(a) 



1 1 1 1 I I 1 1 1 1 1 I I I I 1 1 1 1 1 1 I I I 1 I 1 1 I 1 I 1 1 1 1 I I 1 1 I I 1 1 1 I 1 I 1 



(b) 



1 I 1 1 I 1 1 1 I 1 1 I 1 I I I 1 1 1 I 1 1 I 1 I 1 I 1 1 I 1 I 1 1 1 I 1 I 1 I 1 1 I 1 1 1 I I 1 



12 Given that the first nine zones correspond to the ZIP+4 delivery code 94542- 
2520, determine the parity check digit and the two "hidden digits" in the 
62-bit DPBC 

I I 1 I 1 1 1 I 1 1 I 1 I 1 I 1 1 I 1 1 I 1 1 I 1 I 1 1 I 1 I 1 I 1 I 1 1 1 I 1 I I I 1 1 1 I I 1 1 1 1 1 1 I I 1 I I 1 1 I 

(Hint: Do you need to be told that the parity check digit is last?) 

13 Write out the 52-bit POSTNET barcode for 20742-2461, the ZIP+4 code at 
the University of Maryland used by the Association for Women in 
Mathematics. 

14 Write out all 24 divisors of 360. (See Example 1.1.4.) 

15 Compute the number of positive integer divisors of 
(a) $2^{10}$. (b) $10^{10}$. (c) $12^{10}$. (d) $31^{10}$. 
(e) 360,000. (f) 10!. 
16 Prove that the positive integer n has an odd number of positive-integer divisors 
if and only if it is a perfect square. 

17 Let $D = \{d_1, d_2, d_3, d_4\}$ and $R = \{r_1, r_2, r_3, r_4, r_5, r_6\}$. Compute the number 

(a) of different functions $f : D \to R$. 

(b) of one-to-one functions $f : D \to R$. 

18 The latest automobile license plates issued by the California Department of 
Motor Vehicles begin with a single numeric digit, followed by three letters, 
followed by three more digits. How many different license "numbers" are 
available using this scheme? 

19 One brand of padlocks uses combinations consisting of three (not necessarily 
different) numbers chosen from {0, 1, 2, ..., 39}. If it takes five seconds to 
"dial in" a three-number combination, how long would it take to try all 
possible combinations? 

20 The International Standard Book Number (ISBN) is a 10-digit numerical code 
for identifying books. The grouping of the digits (by means of hyphens) 
varies from one book to another. The first grouping indicates where the book 
was published. In ISBN 0-88175-083-2, the zero shows that the book was 
published in the English-speaking world. The code for the Netherlands is "90" 
as, e.g., in ISBN 90-5699-078-0. Like POSTNET, ISBN employs a check digit 
scheme. The first nine digits (ignoring hyphens) are multiplied, respectively, 
by 10, 9, 8, . . . , 2, and the resulting products summed to obtain S. In 0-88175- 
083-2, e.g., 

$S = 10 \times 0 + 9 \times 8 + 8 \times 8 + 7 \times 1 + 6 \times 7 + 5 \times 5 + 4 \times 0 + 3 \times 8 + 2 \times 3 = 240.$ 

The last (check) digit, L, is chosen so that S + L is a multiple of 11. (In our 
example, L = 2 and $S + L = 242 = 11 \times 22$.) 

(a) Show that, when S is divided by 11, the quotient Q and remainder R satisfy 
S = 11Q + R. 

(b) Show that L = 11 − R. (When R = 1, the check digit is X.) 

(c) What is the value of the check digit, L, in ISBN 0-534-95154-L? 

(d) Unlike POSTNET, the more sophisticated ISBN system can not 
only detect common errors, it can sometimes "correct" them. Suppose, 
e.g., that a single digit is wrong in ISBN 90-5599-078-0. Assuming 
the check digit is correct, can you identify the position of the erroneous 
digit? 

(e) Now that you know the position of the (single) erroneous digit in part (d), 
can you recover the correct ISBN? 

(f) What if it were expected that exactly two digits were wrong in part (d). 
Which two digits might they be? 
21 A total of 9! = 362,880 different nine-letter "words" can be produced by 
rearranging the letters in FULBRIGHT. Of these, how many contain the 
four-letter sequence GRIT? 

22 In how many different ways can eight coins be arranged on an 8x8 
checkerboard so that no two coins lie in the same row or column? 

23 If A is a finite set, its cardinality, o(A), is the number of elements in A. 
Compute 

(a) o(A) when A is the set consisting of all five-digit integers, each digit of 
which is 1, 2, or 3. 

(b) o(B), where B = {x ∈ A : each of 1, 2, and 3 is among the digits of x} 
and A is the set in part (a). 



1.2. PASCAL'S TRIANGLE 

Mathematics is the art of giving the same name to different things. 

— Henri Poincaré (1854-1912) 

In how many different ways can an r-element subset be chosen from an n-element 
set S? Denote the number by C(n,r). Pronounced "n-choose-r", C(n,r) is just a 
name for the answer. Let's find the number represented by this name. 

Some facts about C(n, r) are clear right away, e.g., the nature of the elements of 
S is immaterial. All that matters is that there are n of them. Because the only way to 
choose an n-element subset from S is to choose all of its elements, C(n,n) = 1. 
Having n single elements, S has n single-element subsets, i.e., C(n, 1) = n. For 
essentially the same reason, C(n,n — 1) = n: A subset of S that contains all but 
one element is uniquely determined by the one element that is left out. Indeed, 
this idea has a nice generalization. A subset of S that contains all but r elements 
is uniquely determined by the r elements that are left out. This natural one-to- 
one correspondence between subsets and their complements yields the following 
symmetry property: 

C(n, n − r) = C(n, r). 

1.2.1 Example. By definition, there are C(5,2) ways to select two elements 
from {A,B,C,D,E}. One of these corresponds to the two-element subset {A,B}. 
The complement of {A,B} is {C,D,E}. This pair is listed first in the following 
one-to-one correspondence between two-element subsets and their three-element 
complements: 
{A,B} ↔ {C,D,E},    {B,D} ↔ {A,C,E}
{A,C} ↔ {B,D,E},    {B,E} ↔ {A,C,D}
{A,D} ↔ {B,C,E},    {C,D} ↔ {A,B,E}
{A,E} ↔ {B,C,D},    {C,E} ↔ {A,B,D}
{B,C} ↔ {A,D,E},    {D,E} ↔ {A,B,C}. 

By counting these pairs, we find that C(5, 2) = C(5, 3) = 10. □ 
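
The correspondence between subsets and their complements can be generated mechanically. The Python sketch below is an illustration only; the function name subsets_with_complements is an assumption made for this example. It pairs each r-element subset with the elements left out and confirms that the two families have the same size.

```python
from itertools import combinations


def subsets_with_complements(elements, r):
    """Pair each r-element subset of `elements` with its complement."""
    universe = set(elements)
    return [(set(c), universe - set(c)) for c in combinations(sorted(universe), r)]


pairs = subsets_with_complements("ABCDE", 2)
for subset, complement in pairs:
    print(sorted(subset), "<->", sorted(complement))

print(len(pairs))                             # 10, the value of C(5,2)
print(len(list(combinations("ABCDE", 3))))    # 10, the value of C(5,3)
```

Since different subsets have different complements, the pairing is one-to-one, which is exactly the symmetry property C(n, n − r) = C(n, r) in the case n = 5, r = 2.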

A special case of symmetry is C(n,0) = C(n,n) = 1. Given n objects, there is 
just one way to reject all of them and, hence, just one way to choose none of them. 
What if n = 0? How many ways are there to choose no elements from the empty 
set? To avoid a deep philosophical discussion, let us simply adopt as a convention 
that C(0,0) = 1. 

A less obvious fact about choosing these numbers is the following. 

1.2.2 Theorem (Pascal's Relation). If $1 \le r \le n$, then 

C(n + 1, r) = C(n, r − 1) + C(n, r).    (1.1) 

Together with Example 1.2.1, Pascal's relation implies, e.g., that C(6, 3) = 
C(5,2) + C(5,3) = 20. 

Proof. Consider the (n + 1)-element set $\{x_1, x_2, \ldots, x_n, y\}$. Its r-element subsets 
can be partitioned into two families, those that contain y and those that do not. 
To count the subsets that contain y, simply observe that the remaining r − 1 elements 
can be chosen from $\{x_1, x_2, \ldots, x_n\}$ in C(n, r − 1) ways. The r-element 
subsets that do not contain y are precisely the r-element subsets of 
$\{x_1, x_2, \ldots, x_n\}$, of which there are C(n, r). ■ 

The proof of Theorem 1.2.2 used another self-evident fact that is worth men- 
tioning explicitly. (A much deeper extension of this result will be discussed in 
Chapter 2.) 

1.2.3 The Second Counting Principle. If a set can be expressed as the disjoint 
union of two (or more) subsets, then the number of elements in the set is the sum of 
the numbers of elements in the subsets. 

So far, we have been viewing C(n, r) as a single number. There are some advan- 
tages to looking at these choosing numbers collectively, as in Fig. 1.2.1. The trian- 
gular shape of this array is a consequence of not bothering to write 0 = C(n, r), 
r > n. Filling in the entries we know, i.e., C(n, 0) = C(n, n) = 1, C(n, 1) = n =
C(n, n - 1), C(5, 2) = C(5, 3) = 10, and C(6, 3) = 20, we obtain Fig. 1.2.2.

Figure 1.2.1. C(n, r).

Given the fourth row of the array (corresponding to n = 3), we can use Pascal's 
relation to compute C(4, 2) = C(3, 1) + C(3, 2) = 3 + 3 = 6. Similarly, C(6, 4) = 
C(6, 2) = C(5, 1) + C(5, 2) = 5 + 10 = 15. Continuing in this way, one row at a 
time, we can complete as much of the array as we like. 
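
The row-by-row procedure just described is easy to mechanize. The sketch below is an illustration only; the function name `pascal_rows` is an assumption, not notation from the text. It builds each row from the previous one using Pascal's relation (1.1).

```python
def pascal_rows(n_max):
    """Return rows 0..n_max of the array C(n, r), built with Pascal's relation."""
    rows = [[1]]  # row n = 0, using the convention C(0, 0) = 1
    for n in range(n_max):
        prev = rows[-1]
        # C(n+1, 0) = C(n+1, n+1) = 1; interior entries come from relation (1.1).
        row = [1] + [prev[r - 1] + prev[r] for r in range(1, n + 1)] + [1]
        rows.append(row)
    return rows

for row in pascal_rows(7):
    print(row)
```

Reading off `pascal_rows(7)[4][2]` and `pascal_rows(7)[6][4]` reproduces the values C(4, 2) = 6 and C(6, 4) = 15 computed above.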





n \ r    0        1        2        3        4        5        6        7

0        1
1        1        1
2        1        2        1
3        1        3        3        1
4        1        4        C(4,2)   4        1
5        1        5        10       10       5        1
6        1        6        C(6,2)   20       C(6,4)   6        1
7        1        7        C(7,2)   C(7,3)   C(7,4)   C(7,5)   7        1

Figure 1.2.2



Following Western tradition, we refer to the array in Fig. 1.2.3 as Pascal's 
triangle.* (Take care not to forget, e.g., that C(6, 3) = 20 appears, not in the third
column of the sixth row, but in the fourth column of the seventh!) 

Pascal's triangle is the source of many interesting identities. One of these con- 
cerns the sum of the entries in each row: 



1 + 1 = 2,
1 + 2 + 1 = 4,
1 + 3 + 3 + 1 = 8,
1 + 4 + 6 + 4 + 1 = 16, (1.2)



*After Blaise Pascal (1623-1662), who described it in the book Traité du triangle arithmétique. Rumored
to have been included in a lost mathematical work by Omar Khayyam (ca. 1050-1130), author of the 
Rubaiyat, the triangle is also found in surviving works by the Arab astronomer al-Tusi (1265), the Chinese 
mathematician Chu Shih-Chieh (1303), and the Hindu writer Narayana Pandita (1365). The first European 
author to mention it was Petrus Apianus (1495-1552), who put it on the title page of his 1527 book, 
Rechnung. 
Figure 1.2.3. Pascal's triangle.



and so on. Why should each row sum to a power of 2? In

C(n, 0) + C(n, 1) + ... + C(n, n) = $\sum_{r=0}^{n} C(n, r)$,

C(n, 0) is the number of subsets of S = {x_1, x_2, ..., x_n} that have no elements;
C(n, 1) is the number of one-element subsets of S; C(n, 2) is the number of
two-element subsets, and so on. Evidently, the sum of the numbers in row n of
Pascal's triangle is the total number of subsets of S (even when n = 0 and
S = ∅). The empirical evidence from Equations (1.2) suggests that an n-element
set has a total of 2^n subsets. How might one go about proving this conjecture?

One way to do it is by mathematical induction. There is, however, another
approach that is both easier and more revealing. Imagine yourself in the process
of listing the subsets of S = {x_1, x_2, ..., x_n}. Specifying a subset involves a
sequence of decisions. Decision 1 is whether to include x_1. There are two choices,
Yes or No. Decision 2, whether to put x_2 into the subset, also has two choices.
Indeed, there are two choices for each of the n decisions. So, by the fundamental
counting principle, S has a total of 2 x 2 x ... x 2 = 2^n subsets.

There is more. Suppose, for example, that n = 9. Consider the sequence of decisions
that produces the subset {x_2, x_3, x_6, x_8}, a sequence that might be recorded as
NYYNNYNYN. The first letter of this word corresponds to No, as in "no to x_1"; the
second letter corresponds to Yes, as in "yes to x_2"; because x_3 is in the subset, the
third letter is Y; and so on for each of the nine letters. Similarly, {x_1, x_2, x_3} corresponds
to the nine-letter word YYYNNNNNN. In general, there is a one-to-one
correspondence between subsets of {x_1, x_2, ..., x_n} and n-letter words assembled
from the alphabet {N, Y}. Moreover, in this correspondence, r-element subsets
correspond to words with r Y's and n - r N's.

We seem to have discovered a new way to think about C(n, r). It is the number 
of n-letter words that can be produced by (re)arranging r Y's and n - r N's. This
interpretation can be verified directly. An n-letter word consists of n spaces, or loca- 
tions, occupied by letters. Each of the words we are discussing is completely deter- 
mined once the r locations of the Y's have been chosen (the remaining n — r spaces 
being occupied by N's). 
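
The correspondence between subsets and Yes/No words can be explored by brute force. What follows is a small sketch under stated assumptions: the helper name `subset_words` is hypothetical, and `math.comb` (Python 3.8 or later) stands in for C(n, r).

```python
from itertools import product
from math import comb

def subset_words(n):
    """All n-letter words over {N, Y}; letter i records the decision about x_i."""
    return ["".join(w) for w in product("NY", repeat=n)]

n = 9
words = subset_words(n)
print(len(words) == 2 ** n)

# Words with exactly r Y's correspond to r-element subsets.
for r in range(n + 1):
    assert sum(w.count("Y") == r for w in words) == comb(n, r)

# The word from the text, decoded into the subscripts of the chosen elements.
chosen = [i + 1 for i, ch in enumerate("NYYNNYNYN") if ch == "Y"]
print(chosen)
```

Listing all 2^n words is feasible only for small n, which is exactly why a closed formula for C(n, r) is worth having.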
The significance of this new perspective is that we know how to count the number
of n-letter words with r Y's and n - r N's. That's the MISSISSIPPI problem!
The answer is the multinomial coefficient $\binom{n}{r,\,n-r}$. Evidently,

C(n, r) = $\binom{n}{r,\,n-r}$ = n! / (r! (n - r)!).



For things to work out properly when r = 0 and r = n, we need to adopt another 
convention. Define 0! = 1. (So, 0! is not equal to the nonsensical 0 x (0 - 1) x
(0 - 2) x ... x 1.)

It is common in the mathematical literature to write $\binom{n}{r}$ instead of $\binom{n}{r,\,n-r}$, one
justification being that the information conveyed by "n - r" is redundant. It can be
computed from n and r. The same thing could, of course, be said about any multinomial
coefficient. The last number in the second row is always redundant. So, that
particular argument is not especially compelling. The honest reason for writing $\binom{n}{r}$
is tradition.

We now have two ways to look at C(n, r) = $\binom{n}{r}$. One is what we might call the
combinatorial definition: n-choose-r is the number of ways to choose r things from
a collection of n things. The alternative, what we might call the algebraic definition,
is

C(n, r) = n! / (r! (n - r)!).

Don't make the mistake of assuming, just because it is more familiar, that the
algebraic definition will always be easiest. (Try giving an algebraic proof of the
identity $\sum_{r=0}^{n} \binom{n}{r} = 2^n$.) Some applications are easier to approach using algebraic
methods, while the combinatorial definition is easier for others. Only by
becoming familiar with both will you be in a position to choose the easiest
approach in every situation!

1.2.4 Example. In the basic version of poker, each player is dealt five cards (as
in Fig. 1.2.4) from a standard 52-card deck (no joker). How many different five-card
poker hands are there? Because someone (in a fair game it might be Lady Luck)
chooses five cards from the deck, the answer is C(52, 5). Three ways to find the number
behind this name are: (1) Make an exhaustive list of all possible hands, (2) work
out 52 rows of Pascal's triangle, or (3) use the algebraic definition

C(52, 5) = 52! / (5! 47!)
         = (52 x 51 x 50 x 49 x 48 x 47!) / (5 x 4 x 3 x 2 x 1 x 47!)
         = (52 x 51 x 50 x 49 x 48) / (5 x 4 x 3 x 2 x 1)
         = 52 x 51 x 10 x 49 x 2
         = 2,598,960. □
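
Two of the approaches in Example 1.2.4 are short enough to try by machine. This is only a sketch: the function names `choose_factorial` and `choose_pascal` are illustrative, and Python's built-in `math.comb` is used purely as a cross-check.

```python
from math import comb, factorial

def choose_factorial(n, r):
    """Algebraic definition: n! / (r! (n - r)!)."""
    return factorial(n) // (factorial(r) * factorial(n - r))

def choose_pascal(n, r):
    """Work out rows of Pascal's triangle up to row n and read off entry r."""
    row = [1]
    for _ in range(n):
        row = [1] + [row[i - 1] + row[i] for i in range(1, len(row))] + [1]
    return row[r]

print(choose_factorial(52, 5), choose_pascal(52, 5), comb(52, 5))
```

All three values agree with the hand computation, 2,598,960. The exhaustive list of approach (1) is deliberately left out.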
Figure 1.2.4. A five-card poker hand.



1.2.5 Example. The game of bridge uses the same 52 cards as poker.* The
number of different 13-card bridge hands is

C(52, 13) = 52! / (13! 39!)
          = (52 x 51 x ... x 40 x 39!) / (13! x 39!)
          = (52 x 51 x ... x 40) / 13!,

which is about 635,000,000,000. □



It may surprise you to learn that C(52, 13) is so much larger than C(52, 5). On 
the other hand, it does seem clear from Fig. 1.2.3 that the numbers in each row of 
Pascal's triangle increase, from left to right, up to the middle of the row and then 
decrease from the middle to the right-hand end. Rows for which this property holds 
are said to be unimodal. 

1.2.6 Theorem. The rows of Pascal's triangle are unimodal. 



*The actual, physical cards are typically slimmer to accommodate the larger, 13-card hands. 
Proof. If n > 2r + 1, the ratio

C(n, r + 1) / C(n, r) = r! (n - r)! / ((r + 1)! (n - r - 1)!) = (n - r)/(r + 1) > 1,

implying that C(n, r + 1) > C(n, r). ■
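
The ratio used in this proof, and the unimodal shape it implies, can be spot-checked numerically. The sketch below is illustrative only; `is_unimodal` is a hypothetical helper, and testing rows 0 through 29 is evidence rather than proof.

```python
from math import comb

def is_unimodal(row):
    """True if the row never increases again once it has started to decrease."""
    peak = row.index(max(row))
    rising = all(row[i] <= row[i + 1] for i in range(peak))
    falling = all(row[i] >= row[i + 1] for i in range(peak, len(row) - 1))
    return rising and falling

for n in range(30):
    row = [comb(n, r) for r in range(n + 1)]
    assert is_unimodal(row)
    # The ratio C(n, r+1) / C(n, r) equals (n - r) / (r + 1), cross-multiplied.
    for r in range(n):
        assert comb(n, r + 1) * (r + 1) == comb(n, r) * (n - r)

print(comb(52, 13) > comb(52, 5))
```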



1.2. EXERCISES 

1 Compute 

(a) C(7,4). (b) C(10,5). (c) C(12,4). 

(d) C(101,2). (e) C(101,99). (f) C(12,6). 

2 If n and r are integers satisfying n > r > 0, prove that

(a) (r + 1)C(n, r + 1) = (n - r)C(n, r).

(b) (r + 1)C(n, r + 1) = nC(n - 1, r).

3 Write out rows 7 through 10 of Pascal's triangle and confirm that the sum of 
the numbers in the 10th row is 2^10 = 1024.

4 Consider the sequence of numbers 0, 0, 1, 3, 6, 10, 15, ... from the third
(r = 2) column of Pascal's triangle. Starting with n = 0, the nth term of the
sequence is a_n = C(n, 2). Prove that, for all n > 0,

(a) a_{n+1} - a_n = n. (b) a_{n+1} + a_n = n^2.

5 Consider the sequence b_0, b_1, b_2, ..., where b_n = C(n, 3). Prove that, for
all n > 0,

(a) b_{n+1} - b_n = C(n, 2).

(b) b_{n+2} - b_n is a perfect square.

6 Poker is sometimes played with a joker. How many different five-card poker 
hands can be "chosen" from a deck of 53 cards? 

7 Phrobana is a game played with a deck of 48 cards (no aces). How many 
different 12-card phrobana hands are there? 

8 Give the inductive proof that an n-element set has 2^n subsets.

9 Let r_i be a positive integer, 1 ≤ i ≤ k. If n = r_1 + r_2 + ... + r_k, prove that

$\binom{n}{r_1, r_2, \ldots, r_k} = \binom{n-1}{r_1 - 1, r_2, \ldots, r_k} + \binom{n-1}{r_1, r_2 - 1, \ldots, r_k} + \cdots + \binom{n-1}{r_1, r_2, \ldots, r_k - 1}$

(a) using algebraic arguments.

(b) using combinatorial arguments.

10 Suppose n, k, and r are integers that satisfy n ≥ k ≥ r ≥ 0 and k > 0. Prove that

(a) C(n, k)C(k, r) = C(n, r)C(n - r, k - r).

(b) C(n, k)C(k, r) = C(n, k - r)C(n - k + r, r).

(c) $\sum_{j=0}^{n} C(n, j)C(j, r) = C(n, r)\,2^{n-r}$.

(d) $\sum_{j=k}^{n} (-1)^{j+k} C(n, j) = C(n - 1, k - 1)$.

11 Prove that $\left[\sum_{r=0}^{n} C(n, r)\right]^2 = \sum_{s=0}^{2n} C(2n, s)$.

12 Prove that C(2n,n), n > 0, is always even. 

13 Probably first studied by Leonhard Euler (1707-1783), the Catalan sequence*
1, 1, 2, 5, 14, 42, 132, 429, 1430, 4862, ... is defined by c_n = C(2n, n)/
(n + 1), n ≥ 0. Confirm that the Catalan numbers satisfy

(a) c_2 = 2c_1. (b) c_3 = 3c_2 - c_1.

(c) c_4 = 4c_3 - 3c_2. (d) c_5 = 5c_4 - 6c_3 + c_2.

(e) c_6 = 6c_5 - 10c_4 + 4c_3. (f) c_7 = 7c_6 - 15c_5 + 10c_4 - c_3.

(g) Speculate about the general form of these equations. 

(h) Prove or disprove your speculations from part (g). 

14 Show that the Catalan numbers (Exercise 13) satisfy

(a) c_n = C(2n - 1, n - 1) - C(2n - 1, n + 1).

(b) c_n = C(2n, n) - C(2n, n - 1).

(c) $c_{n+1} = \frac{4n + 2}{n + 2}\, c_n$.

15 One way to illustrate an r-element subset S of {1, 2, ..., n} is this: Let P_0 be
the origin of the xy-plane. Setting x_0 = y_0 = 0, define

$P_k = (x_k, y_k) = \begin{cases} (x_{k-1} + 1,\ y_{k-1}) & \text{if } k \in S, \\ (x_{k-1},\ y_{k-1} + 1) & \text{if } k \notin S. \end{cases}$

Finally, connect successive points by unit segments (either horizontal or
vertical) to form a "path". Figure 1.2.5 illustrates the path corresponding to

S = {3, 4, 6, 8} and n = 8.



*Euler was so prolific that more than one topic has come to be named for the first person to work on it after 
Euler, in this case, Eugene Catalan (1814-1894). 
Figure 1.2.5



(a) Illustrate E = {2, 4, 6, 8} when n = 8. 

(b) Illustrate E = {2, 4, 6, 8} when n = 9. 

(c) Illustrate D = {1,3,5,7} when n = 8. 

(d) Show that P_n = (r, n - r) when S is an r-element set.

(e) A lattice path of length n in the xy-plane begins at the origin and consists
of n unit "steps" each of which is either up or to the right. If r of the steps
are to the right and s = n - r of them are up, the lattice path terminates at
the point (r, s). How many different lattice paths terminate at (r, s)?

16 Define c_0 = 1 and let c_n be the number of lattice paths of length 2n
(Exercise 15) that terminate at (n, n) and never rise above the line y = x,
i.e., such that x_k ≥ y_k for each point P_k = (x_k, y_k). Show that

(a) c_1 = 1, c_2 = 2, and c_3 = 5.

(b) $c_{n+1} = \sum_{r=0}^{n} c_r c_{n-r}$. (Hint: Lattice paths "touch" the line y = x for the
last time at the point (n, n). Count those whose next-to-last touch is at the
point (r, r).)

(c) c_n is the nth Catalan number of Exercises 13-14, n > 1.

17 Let X and Y be disjoint sets containing n and m elements, respectively. In how
many different ways can an (r + s)-element subset Z be chosen from X ∪ Y if
r of its elements must come from X and s of them from Y?

18 Packing for a vacation, a young man decides to take 3 long-sleeve shirts, 
4 short-sleeve shirts, and 2 pairs of pants. If he owns 16 long-sleeve shirts, 
20 short-sleeve shirts, and 13 pairs of pants, in how many different ways can 
he pack for the trip? 
Figure 1.2.6

19 Suppose n is a positive integer and let k = ⌊n/2⌋, the greatest integer not larger
than n/2. Define

F_n = C(n, 0) + C(n - 1, 1) + C(n - 2, 2) + ... + C(n - k, k).

Starting with n = 0, the sequence {F_n} is

1, 1, 2, 3, 5, 8, 13, ...,

where, e.g., the 7th number in the sequence, F_6 = 13, is computed by
summing the boldface numbers in Fig. 1.2.6.

(a) Compute Fq directly from the definition. 

(b) Prove the recurrence F_{n+2} = F_{n+1} + F_n, n > 0.

(c) Compute F_7 using part (b) and the initial fragment of the sequence given
above.

(d) Prove that $\sum_{i=0}^{n} F_i = F_{n+2} - 1$.

20 C. A. Tovey used the Fibonacci sequence† (Exercise 19) to prove that infinitely
many pairs (n, k) solve the equation C(n, k) = C(n - 1, k + 1). The first pair is
C(2, 0) = C(1, 1). Find the second. (Hint: n < 20. Your solution need not
make use of the Fibonacci sequence.)

21 The Buda side of the Danube is hilly and suburban while the Pest side is flat 
and urban. In short, Budapest is a divided city. Following the creation of a new 
commission on culture, suppose 6 candidates from Pest and 4 from Buda 
volunteer to serve. In how many ways can the mayor choose a 5-member
commission



†It was the French number theorist François Édouard Anatole Lucas (1842-1891) who named these
numbers after Leonardo of Pisa (ca. 1180-1250), a man also known as Fibonacci.

(a) from the 10 candidates?

(b) if proportional representation dictates that 3 members come from Pest and 
2 from Buda? 

22 H. B. Mann and D. Shanks discovered a criterion for primality in terms of
Pascal's triangle: Shift each of the n + 1 entries in row n to the right so that
they begin in column 2n. Circle the entries in row n that are multiples of n.
Then r is prime if and only if all the entries in column r have been circled.
Columns 0-11 are shown in Fig. 1.2.7. Continue the figure down to row 9 and
out to column 20.

n \ r    0    1    2    3    4    5    6    7    8    9   10   11
1                (1)  (1)
2                            1  (2)    1
3                                      1  (3)  (3)    1
4                                                1  (4)    6  (4)
5                                                          1  (5)

Figure 1.2.7 (circled entries are shown in parentheses)


23 The superintendent of the Hardluck Elementary School District suggests that 
the Board of Education meet a $5 million budget deficit by raising average 
class sizes, from 30 to 36 students, a 20% increase. A district teacher objects, 
pointing out that if the proposal is adopted, the potential for a pair of 
classmates to get into trouble will increase by 45%. What is the teacher 
talking about? 

24 Strictly speaking, Theorem 1.2.6 establishes only half of the unimodality 
property. Prove the other half. 

25 If n and r are nonnegative integers and x is an indeterminate, define

K(n, r) = (1 + x)^n x^r.

(a) Show that K(n + 1, r) = K(n, r) + K(n, r + 1).

(b) Compare and contrast the identity in part (a) with Pascal's relation.

(c) Since part (a) is a polynomial identity, it holds when numbers are
substituted for x. Let k(n, r) be the value of K(n, r) when x = 2 and
exhibit the numbers k(n, r), 0 ≤ n, r ≤ 4, in a 5 x 5 array, the rows of
which are indexed by n and the columns by r. (Hint: Visually confirm that

k(n + 1, r) = k(n, r) + k(n, r + 1), 0 ≤ n, r ≤ 3.)
26 Let S be an n-element set, where n > 1. If A is a subset of S, denote by o(A)
the cardinality of (number of elements in) A. Say that A is odd (even) if o(A) is 
odd (even). Prove that the number of odd subsets of S is equal to the number of 
its even subsets. 

27 Show that there are exactly seven different ways to factor n = 63,000 as a 
product of two relatively prime integers, each greater than one. 

28 Suppose n = p_1^{a_1} p_2^{a_2} ... p_r^{a_r}, where p_1, p_2, ..., p_r are distinct primes. Prove that
there are exactly 2^{r-1} - 1 different ways to factor n as a product of two
relatively prime integers, each greater than one.



*1.3. ELEMENTARY PROBABILITY 

The theory of probabilities is basically only common sense reduced to calculation; it 
makes us appreciate with precision what reasonable minds feel by a kind of instinct, 
often being unable to account for it. ... It is remarkable that [this] science, which 
began with the consideration of games of chance, should have become the most impor- 
tant object of human knowledge. 

— Pierre Simon, Marquis de Laplace (1749-1827) 

Elementary probability theory begins with the consideration of D equally likely 
"events" (or "outcomes"). If N of these are "noteworthy", then the probability 
of a noteworthy event is the fraction N/D. Maybe a brown paper bag contains a 
dozen jelly beans, say, 1 red, 2 orange, 2 blue, 3 green, and 4 purple. If a jelly
bean is chosen at random from the bag, the probability that it will be blue is
2/12 = 1/6; the probability that it will be green is 3/12 = 1/4; the probability that it will
be blue or green is (2 + 3)/12 = 5/12; and the probability that it will be blue and
green is 0/12 = 0.

Dice are commonly associated with games of chance. In a dice game, one is 
typically interested only in the numbers that rise to the top. If a single die is rolled, 
there are just six outcomes; if the die is "fair", each of them is equally likely. In 
computing the probability, say, of rolling a number greater than 4 with a single fair 
die, the denominator is D = 6. Since there are N = 2 noteworthy outcomes, namely
5 and 6, the probability we want is P = 2/6 = 1/3.

The situation is more complicated when two dice are rolled. If all we care about 
is their sum, then there are 11 possible outcomes, anything from 2 to 12. But, the 
probability of rolling a sum, say, of 7 is not 1/11 because these 11 outcomes are not
equally likely. To help facilitate the discussion, assume that one of the dice is green
and the other is red. Each time the dice are rolled, Lady Luck makes two decisions,
choosing a number for the green die, and one for the red. Since there are 6 choices
for each of them, the two decisions can be made in any one of 6^2 = 36 ways. If both
dice are fair, then each of these 36 outcomes is equally likely. Glancing at Fig. 1.3.1, 
Figure 1.3.1. The 36 outcomes of rolling two dice.



one sees that there are six ways the dice can sum to 7, namely, a green 1 and a red 6,
a green 2 and a red 5, a green 3 and a red 4, and so on. So, the probability of rolling
a (sum of) 7 is not 1/11 but 6/36 = 1/6.

1.3.1 Example. Denote by P(n) the probability of rolling (a sum of) n with
two fair dice. Using Fig. 1.3.1, it is easy to see that P(2) = 1/36 = P(12),
P(3) = 2/36 = 1/18 = P(11), P(4) = 3/36 = 1/12 = P(10), and so on. What about P(1)?
Since 1 is not among the outcomes, P(1) = 0/36 = 0. In fact, if P is some probability
(any probability at all), then 0 ≤ P ≤ 1. □
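
The probabilities in Example 1.3.1 can be tabulated by listing the 36 equally likely outcomes. This is a brief sketch; the names `outcomes` and `counts` are illustrative, and `P` mirrors the P(n) of the example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# The 36 equally likely (green, red) outcomes.
outcomes = list(product(range(1, 7), repeat=2))
counts = Counter(g + r for g, r in outcomes)

def P(n):
    """Probability of rolling a sum of n with two fair dice."""
    return Fraction(counts[n], len(outcomes))

for n in range(1, 13):
    print(n, P(n))
```

Using `Fraction` keeps every value exact, so P(7) appears as 1/6 and P(1) as 0.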



1.3.2 Example. A popular game at charity fundraisers is Chuck-a-Luck. The 
apparatus for the game consists of three dice housed in an hourglass-shaped 
cage. Once the patrons have placed their bets, the operator turns the cage and the 
dice roll to the bottom. If none of the dice comes up 1, the bets are lost. Otherwise, 
the operator matches, doubles, or triples each wager depending on the number of
"aces" (1's) showing on the three dice.

Let's compute probabilities for various numbers of 1's. By the fundamental
counting principle, there are 6^3 = 216 possible outcomes (all of which are equally
likely if the dice are fair). Of these 216 outcomes, only one consists of three 1's.
Thus, the probability that the bets will have to be tripled is 1/216.

In how many ways can two 1's come up? Think of it as a sequence of two decisions.
The first is which die should produce a number different from 1. The second
is what number should appear on that die. There are three choices for the first decision
and five for the second. So, there are 3 x 5 = 15 ways for the three dice to
produce exactly two 1's. The probability that the bets will have to be doubled is 15/216.

What about a single ace? This case can be approached as a sequence of three
decisions. Decision 1 is which die should produce the 1 (three choices). The second
decision is what number should appear on the second die (five choices, anything but
1). The third decision is the number on the third die (also five choices). Evidently,
there are 3 x 5 x 5 = 75 ways to get exactly one ace. So far, we have accounted for
1 + 15 + 75 = 91 of the 216 possible outcomes. (In other words, the probability of
getting at least one ace is 91/216.) In the remaining 216 - 91 = 125 outcomes, there
are no 1's at all. These results are tabulated in Fig. 1.3.2. □

Figure 1.3.2. Chuck-a-Luck probabilities.
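
The counts 1, 15, 75, and 125 can be confirmed by enumerating all 216 rolls. The following sketch is illustrative only; the variable names `rolls` and `aces` are assumptions.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=3))      # 6^3 = 216 outcomes
aces = Counter(roll.count(1) for roll in rolls)   # number of 1's in each roll

for k in range(4):
    print(k, aces[k], Fraction(aces[k], len(rolls)))
```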

Some things, like determining which team kicks off to start a football game, are 
decided by tossing a coin. A fair coin is one in which each of the two possible out- 
comes, heads or tails, is equally likely. When a fair coin is tossed, the probability 
that it will come up heads is 1/2.

Suppose four (fair) coins are tossed. What is the probability that half of them
will be heads and half tails? Is it obvious that the answer is 3/8? Once again, Lady
Luck has a sequence of decisions to make, this time four of them. Since there are
two choices for each decision, D = 2^4 = 16. With the noteworthies marked by asterisks, these 16
outcomes are arrayed in Fig. 1.3.3. By inspection, N = 6, so the probability we seek
is 6/16 = 3/8.

HHHH    HTHH    THHH    TTHH*

HHHT    HTHT*   THHT*   TTHT

HHTH    HTTH*   THTH*   TTTH

HHTT*   HTTT    THTT    TTTT

Figure 1.3.3

1.3.3 Example. If 10 (fair) coins are tossed, what is the probability that half of
them will be heads and half tails? Ten decisions, each with two choices, yield
D = 2^10 = 1024. To compute the numerator, imagine a systematic list analogous
to Fig. 1.3.3. In the case of 10 coins, the noteworthy outcomes correspond to
10-letter "words" with five H's and five T's, so N = $\binom{10}{5,\,5}$ = C(10, 5) = 252, and
the desired probability is 252/1024 ≈ 0.246. More generally, if n coins are tossed, the
probability that exactly r of them will come up heads is C(n, r)/2^n.

What about the probability that at most r of them will come up heads? That's
easy enough: P = N/2^n, where N = N(n, r) = C(n, 0) + C(n, 1) + ... + C(n, r)
is the number of n-letter words that can be assembled from the alphabet {H, T}
and that contain at most r H's. □
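
Both formulas from Example 1.3.3 translate directly into code. In this sketch the function names `p_exactly` and `p_at_most` are hypothetical, and exact fractions are used so that no rounding enters.

```python
from fractions import Fraction
from math import comb

def p_exactly(n, r):
    """Probability that exactly r of n fair coins come up heads: C(n, r) / 2^n."""
    return Fraction(comb(n, r), 2 ** n)

def p_at_most(n, r):
    """Probability of at most r heads: (C(n, 0) + ... + C(n, r)) / 2^n."""
    return sum(p_exactly(n, k) for k in range(r + 1))

print(p_exactly(10, 5), float(p_exactly(10, 5)))
print(p_at_most(10, 5))
```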

Here is a different kind of problem: Suppose two fair coins are tossed, say a 
dime and a quarter. If you are told (only) that one of them is heads, what is the 
probability that the other one is also heads? (Don't just guess, think about it.) 

May we assume, without loss of generality, that the dime is heads? If so, because
the quarter has a head of its own, so to speak, the answer should be 1/2. To see why
this is wrong, consider the equally likely outcomes when two fair coins are tossed,
namely, HH, HT, TH, and TT. If all we know is that one (at least) of the coins is
heads, then TT is eliminated. Since the remaining three possibilities are still equally
likely, D = 3, and the answer is 1/3.
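
The two readings of the question lead to different reduced sample spaces, which a short enumeration makes concrete. This is a sketch only; each outcome is written dime first, quarter second, and all variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

outcomes = ["".join(p) for p in product("HT", repeat=2)]  # HH, HT, TH, TT

at_least_one_head = [o for o in outcomes if "H" in o]   # TT is eliminated
dime_is_heads = [o for o in outcomes if o[0] == "H"]    # only HH and HT remain

p_given_some_head = Fraction(at_least_one_head.count("HH"), len(at_least_one_head))
p_given_dime_head = Fraction(dime_is_heads.count("HH"), len(dime_is_heads))
print(p_given_some_head, p_given_dime_head)  # prints: 1/3 1/2
```

The tempting answer 1/2 is correct for the second question, not the first.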

There are two "morals" here. One is that the most reliable guide to navigating 
probability theory is equal likelihood. The other is that finding a correct answer 
often depends on having a precise understanding of the question, and that requires 
precise language. 

1.3.4 Definition. A nonempty finite set E of equally likely outcomes is called a
sample space. The number of elements in E is denoted o(E). For any subset A of E,
the probability of A is P(A) = o(A)/o(E). If B is a subset of E, then P(A or B) =
P(A ∪ B), and P(A and B) = P(A ∩ B).

In mathematical writing, an unqualified "or" is inclusive, as in "A or B or both".*

1.3.5 Theorem. Let E be a fixed but arbitrary sample space. If A and B are
subsets of E, then

P(A or B) = P(A) + P(B) - P(A and B).

Proof. The sum o(A) + o(B) counts all the elements of A and all the elements of
B. It even counts some elements twice, namely those in A ∩ B. Subtracting o(A ∩ B)
compensates for this double counting and yields

o(A ∪ B) = o(A) + o(B) - o(A ∩ B).

(Notice that this formula generalizes the second counting principle; it is a
special case of the even more general principle of inclusion and exclusion, to be
discussed in Chapter 2.) It remains to divide both sides by o(E) and use
Definition 1.3.4. ■



*The exclusive "or" can be expressed using phrases like "either A or B" or "A or B but not both".

1.3.6 Corollary. Let E be a fixed but arbitrary sample space. If A and B are
subsets of E, then P(A or B) ≤ P(A) + P(B) with equality if and only if A and B
are disjoint.

Proof. P(A and B) = 0 if and only if o(A ∩ B) = 0 if and only if A ∩ B = ∅. ■

A special case of this corollary involves the complement, A^c = {x ∈ E : x ∉ A}.
Since A ∪ A^c = E and A ∩ A^c = ∅, o(A) + o(A^c) = o(E). Dividing both sides of
this equation by o(E) yields the useful identity

P(A) + P(A^c) = 1.

1.3.7 Example. Suppose two fair dice are rolled, say a red one and a green one.
What is the probability of rolling a 3 on the red die, call it a red 3, or a green 2?
Let's abbreviate by setting R3 = red 3 and G2 = green 2 so that, e.g.,
P(R3) = 1/6 = P(G2).

Solution 1: When both dice are rolled, only one of the 6^2 = 36 equally likely
outcomes corresponds to R3 and G2, so P(R3 and G2) = 1/36. Thus, by Theorem
1.3.5,

P(R3 or G2) = P(R3) + P(G2) - P(R3 and G2)
            = 1/6 + 1/6 - 1/36
            = 11/36.

Solution 2: Let P^c be the complementary probability that neither R3 nor G2
occurs. Then P^c = N/D, where D = 36. The evaluation of N can be viewed in
terms of a sequence of two decisions. There are five choices for the "red" decision,
anything but number 3, and five for the "green" one, anything but number 2.
Hence, N = 5 x 5 = 25, and P^c = 25/36, so the probability we want is

P(R3 or G2) = 1 - P^c = 11/36. □
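
Both solutions of Example 1.3.7 can be replayed with Python sets standing in for the subsets of Definition 1.3.4. This is a minimal sketch; `E`, `R3`, `G2`, and `P` follow the text's notation, while the encoding of an outcome as a (red, green) pair is an assumption made here.

```python
from fractions import Fraction
from itertools import product

E = set(product(range(1, 7), repeat=2))   # outcomes as (red, green) pairs
R3 = {e for e in E if e[0] == 3}          # red die shows 3
G2 = {e for e in E if e[1] == 2}          # green die shows 2

def P(A):
    """P(A) = o(A) / o(E)."""
    return Fraction(len(A), len(E))

by_theorem = P(R3) + P(G2) - P(R3 & G2)   # Solution 1, via Theorem 1.3.5
by_complement = 1 - P(E - (R3 | G2))      # Solution 2, via the complement
print(P(R3 | G2), by_theorem, by_complement)
```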



1.3.8 Example. Suppose a single (fair) die is rolled twice. What is the probability
that the first roll is a 3 or the second roll is a 2? Solution: 11/36. This problem is
equivalent to the one in Example 1.3.7. □

1.3.9 Example. Suppose a single (fair) die is rolled twice. What is the probability
of getting a 3 or a 2?

Solution 1: Of the 6 x 6 = 36 equally likely outcomes, 4 x 4 = 16 involve
neither a 3 nor a 2. The complementary probability is P(2 or 3) = 1 - 16/36 = 5/9.

Solution 2: There are two ways to roll a 3 and a 2; either the 3 comes first followed
by the 2 or the other way around. So, P(3 and 2) = 2/36 = 1/18. Using Theorem
1.3.5, P(3 or 2) = 1/6 + 1/6 - 1/18 = 5/18.

Whoops! Since 5/9 ≠ 5/18, one (at least) of these "solutions" is incorrect. The probability
computed in solution 1 is greater than 1/2, which seems too large. On the other
hand, it is not hard to spot an error in solution 2, namely, the incorrect application of
Theorem 1.3.5. The calculation P(3) = 1/6 would be valid had the die been rolled
only once. For this problem, the correct interpretation of P(3) is the probability
that the first roll is 3 or the second roll is 3. That should be identical to the probability
determined in Example 1.3.8. (Why?) Using the (correct) values
P(3) = P(2) = 11/36 in solution 2, we obtain P(2 or 3) = 11/36 + 11/36 - 2/36 = 5/9.

The next time you get a chance, roll a couple of dice and see if you can avoid 
both 2's and 3's more than 44 times out of 99. □ 
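
A direct enumeration settles which of the two "solutions" in Example 1.3.9 is right. The sketch below is illustrative; the set names `got3` and `got2` are assumptions, and an outcome is recorded as a (first roll, second roll) pair.

```python
from fractions import Fraction
from itertools import product

E = set(product(range(1, 7), repeat=2))   # (first roll, second roll)
got3 = {e for e in E if 3 in e}           # a 3 on the first or the second roll
got2 = {e for e in E if 2 in e}           # a 2 on the first or the second roll

def P(A):
    return Fraction(len(A), len(E))

print(P(got3), P(got2), P(got3 & got2))
print(P(got3 | got2), P(got3) + P(got2) - P(got3 & got2))
```

The first line of output shows why 1/6 was the wrong value to feed into Theorem 1.3.5 here.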

Another approach to P(A and B) emerges from the notion of "conditional 
probability". 

1.3.10 Definition. Let E be a fixed but arbitrary sample space. If A and B are
subsets of E, the conditional probability

$P(B|A) = \begin{cases} P(B) & \text{if } A = \emptyset, \\ o(A \cap B)/o(A) & \text{otherwise.} \end{cases}$



When A is not empty, P(B|A) may be viewed as the probability of B given that A
is certain (e.g., known already to have occurred). The problem of tossing two fair
coins, a dime and a quarter, involved conditional probabilities. If h and t represent
heads and tails, respectively, for the dime and H and T for the quarter, then the
sample space E = {hH, hT, tH, tT}. If A = {hH, hT, tH} and B = {hH}, then
P(B|A) = 1/3 is the probability that both coins are heads given that one of them is.
If C = {hH, hT}, then P(B|C) = 1/2 is the probability that both coins are heads given
that the dime is.

1.3.11 Theorem. Let E be a fixed but arbitrary sample space. If A and B are
subsets of E, then

P(A and B) = P(A)P(B|A).

Proof. Let D = o(E), a = o(A), and N = o(A ∩ B). If a = 0, there is nothing to
prove. Otherwise, P(A) = a/D, P(B|A) = N/a, and P(A)P(B|A) = (a/D)(N/a) =
N/D = P(A and B). ■

1.3.12 Corollary (Bayes's* First Rule). Let E be a fixed but arbitrary sample
space. If A and B are subsets of E, then P(A)P(B|A) = P(B)P(A|B).

Proof. Because P(A and B) = P(B and A), the result is immediate from Theorem
1.3.11. ■
1.3.13 Definition. Suppose E is a fixed but arbitrary sample space. Let A and B
be subsets of E. If P(B|A) = P(B), then A and B are independent.

Definitions like this one are meant to associate a name with a phenomenon. In
particular, Definition 1.3.13 is to be understood in the sense that A and B are independent
if and only if P(B|A) = P(B). (In statements of theorems, on the other
hand, "if" should never be interpreted to mean "if and only if".)

In plain English, A and B are independent if A = ∅ or if A ≠ ∅ and the probability
of B is the same whether A is known to have occurred or not. It follows from
Corollary 1.3.12 (and the definition) that P(B|A) = P(B) if and only if
P(A|B) = P(A), i.e., A and B are independent if and only if B and A are independent.
A combination of Definition 1.3.13 and Theorem 1.3.11 yields

P(A and B) = P(A)P(B) (1.3) 

if and only if A and B are independent. 

Equation (1.3) is analogous to the case of equality in Corollary 1.3.6, i.e., 
that 

P(A or B) = P(A) + P(B) (1.4)

if and only if A and B are disjoint. Let's compare and contrast the words indepen- 
dent and disjoint. 

1.3.14 Example. Suppose a card is drawn from a standard 52-card deck. Let K
represent the outcome that the card is a king and C the outcome that it is a club.†
Because P(C) = 13/52 = 1/4 = P(C|K), these outcomes are independent and, as
expected,

P(K)P(C) = (1/13)(1/4)
         = 1/52
         = P(king of clubs)
         = P(K and C).

Because K ∩ C = {king of clubs} ≠ ∅, K and C are not disjoint. As expected,
P(K or C) = 16/52 differs from P(K) + P(C) = 4/52 + 13/52 = 17/52 by 1/52 = P(K and C).

If Q is the outcome that the card is a queen, then K and Q are disjoint but not
independent. In particular, P(K or Q) = 8/52 = P(K) + P(Q), but P(Q) = 4/52 = 1/13
while P(Q|K) = 0.



*Thomas Bayes (1702-1761), an English mathematician and clergyman, was among those who defended 
Newton's calculus against the philosophical attack of Bishop Berkeley. He is better known, however, for 
his Essay Towards Solving a Problem in the Doctrine of Chances. 

†Alternatively, let E be the set of all 52 cards, K the four-element subset of kings, and C the subset of all 13 
clubs. 

Finally, let F be the outcome that the chosen card is a "face card" (a king, 
queen, or jack). Because K ∩ F = K ≠ ∅, outcomes K and F are not disjoint. Since 
P(F) = 12/52 = 3/13 while P(F|K) = 1, neither are they independent. □ 
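
The claims of Example 1.3.14 can be confirmed by brute force. The following Python sketch is an illustration only, not part of the original text; the names DECK, prob, cond_prob, independent, and disjoint are choices made here, and independence is tested by means of Equation (1.3).

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["clubs", "diamonds", "hearts", "spades"]
DECK = set(product(RANKS, SUITS))          # the sample space E: 52 cards


def prob(event):
    """P(event) when every card is equally likely."""
    return Fraction(len(event), len(DECK))


def cond_prob(b, a):
    """P(B|A) for a nonempty event A."""
    return Fraction(len(a & b), len(a))


def independent(a, b):
    """Equation (1.3): P(A and B) = P(A)P(B)."""
    return prob(a & b) == prob(a) * prob(b)


def disjoint(a, b):
    """True when A and B have no outcome in common."""
    return not (a & b)


K = {card for card in DECK if card[0] == "K"}               # kings
Q = {card for card in DECK if card[0] == "Q"}               # queens
C = {card for card in DECK if card[1] == "clubs"}           # clubs
F = {card for card in DECK if card[0] in {"K", "Q", "J"}}   # face cards

for name, (a, b) in {"K, C": (K, C), "K, Q": (K, Q), "K, F": (K, F)}.items():
    print(name, "independent:", independent(a, b), "disjoint:", disjoint(a, b))
```

The three pairs exhibit three of the four possible combinations. The fourth, independent and disjoint, is the subject of Exercise 22.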

1.3.15 Example. Imagine two copy editors independently proofreading the 
same manuscript. Suppose editor X finds x typographical errors while editor Y finds 
y. Denote by z the number of typos discovered by both editors so that, together, they 
identify a total of x + y - z errors. George Pólya showed how this information can 
be used to estimate the number of typographical errors overlooked by both editors.* 
If the manuscript contains a total of t typos, then the empirical probability that 
editor X discovered (some randomly chosen) one of them is P(X) = x/t. If, on 
the other hand, one of the errors discovered by Y is chosen at random, the empirical 
probability that X found it is P(X|Y) = z/y. If X is a consistent, experienced 
worker, these two "productivity ratings" should be about the same. Setting 
z/y = x/t (i.e., assuming P(X|Y) = P(X)) yields t = xy/z. □ 
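
As a small illustration, the estimate of Example 1.3.15 can be packaged as a function. This is a sketch under the example's assumptions; the function name and the sample figures (30, 24, and 20) are hypothetical and do not come from Pólya's article.

```python
from fractions import Fraction


def polya_estimate(x, y, z):
    """Estimate typo counts from two independent proofreaders.

    x, y: errors found by editors X and Y; z: errors found by both.
    Returns (t, missed), where t = xy/z estimates the total number of
    typos and missed = t - (x + y - z) estimates those nobody found.
    """
    if z == 0:
        raise ValueError("no estimate is possible when the editors share no finds")
    t = Fraction(x * y, z)
    found = x + y - z
    return t, t - found


# Hypothetical figures: X finds 30 typos, Y finds 24, and 20 are common to both.
total, missed = polya_estimate(30, 24, 20)
print(total, missed)
```

With these made-up figures the estimated total is 36, of which 34 were found by at least one editor, leaving an estimated 2 typos overlooked by both.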

1.3.16 Example. In the popular game Yahtzee, five dice are rolled in hopes of 
obtaining various outcomes. Suppose you needed to roll three 4's to win the 
game. What is the probability of rolling exactly three 4's in a single throw of the 
five dice? 

Solution: There are C(5, 3) = 10 ways for Lady Luck to choose three dice to be 
the 4's, e.g., the "first" three dice might be 4's while the remaining two are not; 
dice 1, 2, and 5 might be 4's while dice 3 and 4 are not; and so on. Label these 
ten outcomes A_1, A_2, ..., A_10. 

The computation of P(A_1), say, is a classic application of Equation (1.3). The 
probability of rolling a 4 on one die is independent of the number rolled on any 
of the other dice. Since the probability that any one of the first three dice shows 
a 4 is 1/6 and the probability that either one of the last two does not is 5/6, 

P(A_1) = (1/6) × (1/6) × (1/6) × (5/6) × (5/6). 

Similarly, P(A_i) = (1/6)^3 (5/6)^2, 2 ≤ i ≤ 10. 
If, e.g., 

A_1 = {dice 1, 2, and 3 are 4's while dice 4 and 5 are not} 

and 

A_3 = {dice 1, 2, and 5 are 4's while dice 3 and 4 are not}, 



*In a 1976 article published in the American Mathematical Monthly. 

then the third die is a 4 in every outcome belonging to A_1 while it is anything but a 4 
in each outcome of A_3, i.e., A_1 ∩ A_3 = ∅. Similarly, A_i and A_j are disjoint for all 
i ≠ j. Therefore, by Equation (1.4), 

P(three 4's) = P(A_1 or A_2 or ... or A_10) 

= P(A_1) + P(A_2) + ... + P(A_10) 

= 10 (1/6)^3 (5/6)^2. 

So, the probability of rolling exactly three 4's in a single throw of five fair dice 
is 

C(5, 3) (1/6)^3 (5/6)^2 = 0.032... . □ 

Example 1.3.16 illustrates a more general pattern. The probability of rolling 
exactly r 4's in a single throw of n fair dice is C(n, r) (1/6)^r (5/6)^(n-r). If a single fair 
die is thrown n times, the probability of rolling exactly r 4's is the same: 
C(n, r) (1/6)^r (5/6)^(n-r). A similar argument applies to n independent attempts to perform 
any other "trick". If the probability of a successful attempt is p, then the probability 
of an unsuccessful attempt is q = 1 - p, and the probability of being successful in 
exactly r of the n attempts is 

P(r) = C(n, r) p^r q^(n-r), 0 ≤ r ≤ n. (1.5) 

Equation (1.5) governs what has come to be known as a binomial probability 
distribution. 
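
Equation (1.5) translates directly into a few lines of code. The sketch below is illustrative; the function name binomial_pmf is an assumption made here, and exact rational arithmetic is used so that the Yahtzee probability of Example 1.3.16 can be recovered without rounding.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, r, p):
    """Equation (1.5): probability of exactly r successes in n independent
    attempts, each of which succeeds with probability p."""
    q = 1 - p
    return comb(n, r) * p**r * q**(n - r)


# Exactly three 4's in a single throw of five fair dice (Example 1.3.16).
three_fours = binomial_pmf(5, 3, Fraction(1, 6))
print(three_fours, float(three_fours))

# The probabilities P(0), P(1), ..., P(n) account for every possibility.
print(sum(binomial_pmf(5, r, Fraction(1, 6)) for r in range(6)))
```

The first value is 250/7776 in lowest terms, consistent with the 0.032... found in the example, and the final sum is 1.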



1.3. EXERCISES 

1 According to an old adage, it is unsafe to eat shellfish during a month whose 
name does not contain the letter R. What is the probability that it is unsafe to 
eat shellfish (according to the adage) during a randomly chosen month of the 
year? 

2 Suppose two fair dice are rolled. What is the probability that their sum is 
(a) 5? (b) 6? (c) 8? (d) 9? 



3 Suppose three fair dice are rolled. What is the probability that their sum is 
(a) 5? (b) 9? (c) 12? (d) 15? 

4 Suppose a fair coin is tossed 10 times and the result is 10 successive 
heads. What is the probability that heads will be the outcome the next 
time the coin is tossed? (If you didn't know the coin was fair, you might 
begin to suspect otherwise. The chi-squared statistic, which is beyond the 
scope of this book, affords a way to estimate the probability that a fair coin 
would produce discrepancies from expected behavior that are this bad or 
worse.) 

5 Many game stores carry dodecahedral dice having 12 pentagonal faces 
numbered 1-12. Suppose a pair of fair dodecahedral dice are rolled. What is 
the probability that they will sum to 

(a) 5? (b) 7? (c) 13? (d) 25? 

6 In what fraction of six-child families are half the children girls and half boys? 
(Assume that boys and girls are equally likely.) 

7 Suppose you learn that in a particular two-child family one (at least) of the 
children is a boy. What is the probability that the other child is a boy? (Assume 
that boys and girls are equally likely.) 

8 Suppose the king and queen of hearts are shuffled together with the king and 
queen of spades and all four cards are placed face down on a table. 

(a) If your roommate picks up two of the cards and says, "I have a king!" 
what is the probability that s/he has both kings? (Don't just guess. Work it 
out as if your life depended on getting the right answer.) 

(b) If your roommate picks up two of the cards and says, "I have the king of 
spades", what is the probability that s/he has both kings? 

9 In the Chuck-a-Luck game of Example 1.3.2, show how the fundamental 
counting principle can be used to enumerate the outcomes that don't contain 
any 1's at all. 

10 Suppose that six dice are tossed. What is the probability of rolling exactly 
(a) three 4's? (b) four 4's? (c) five 4's? 

11 Suppose that five cards are chosen at random from a standard 52-card deck. 
Show that the probability they comprise a "flush" is about 0.002. (A flush is a 
poker hand each card of which comes from the same suit.) 

12 Suppose some game of chance offers the possibility of winning one of a 
variety of prizes. Maybe there are n prizes with values v_1, v_2, ..., v_n. If the 
probability of winning the ith prize is p_i, then the expected value of the game 
is 

Σ_{i=1}^{n} v_i p_i. 

Consider, e.g., a version of Chuck-a-Luck in which, on any given turn, you win 
$1 for each ace. 

(a) Show that the expected value of this game is 50 cents. (Hint: Figure 1.3.2.) 

(b) What is the maximum amount anyone should be willing to pay for the 
privilege of playing this version each time the cage is turned? 

(c) What is the maximum amount anyone should be willing to wager on this 
version each time the cage is turned? (The difference between "paying for 
the privilege of playing" and "wagering" is that, in the first case, your 
payment is lost, regardless of the outcome, whereas in the second case, 
you keep your wager unless the outcome is no aces at all.) 

13 Does Chuck-a-Luck follow a binomial probability distribution? (Justify your 
answer.) 

14 Suppose four fair coins are tossed. Let A be the set of outcomes in which at 
least two of the coins are heads, B the set in which at most two of the coins are 
heads, and C the set in which exactly two of the coins are heads. Compute 
(a) P(A). (b) P(B). (c) P(C). 

(d) P(A|B). (e) P(A|C). (f) P(A or B). 

15 In 1654, Antoine Gombaud, the Chevalier de Méré, played a game in which 
he bet that at least one 6 would result when four dice are rolled. What is the 
probability that de Méré won in any particular instance of this game? (Assume 
the dice were fair.) 

16 Perhaps because he could no longer find anyone to take his bets (see 
Exercise 15), the Chevalier de Méré switched to betting that, in any 24 
consecutive rolls of two (fair) dice, "boxcars" (double 6's) would occur at 
least once. What is the probability that he won in any particular instance of this 
new game? 

17 Suppose you toss a half-dollar coin n times. How large must n be to guarantee 
that your probability of getting heads at least once is better than 0.99? 

18 The following problem was once posed by the diarist Samuel Pepys to Isaac 
Newton. "Who has the greatest chance of success: a man who throws six dice 
in hopes of obtaining at least one 6; a man who throws twelve dice in hopes of 
obtaining at least two 6's; or a man who throws eighteen dice in hopes of 
obtaining at least three 6's?" Compute the probability of success in each of the 
three cases posed by Pepys. 

19 Are P(A|B) and P(B|A) always the same? (Justify your answer.) 

20 Suppose that each of k people secretly chooses an integer between 1 and m 
(inclusive). Let P be the probability that some two of them choose the same 
number. Compute P (rounded to two decimal places) when 

(a) (m,k) = (10,4) (b) (m,k) = (20,6) (c) (m,k) = (365,23) 

(Hint: Compute the complementary probability that everyone chooses different 
numbers.) 

21 Suppose 23 people are chosen at random from a crowd. Show the probability 
that some two of them share the same birthday (just the day, not the day and 
year) is greater than 1/2. (Assume that none of them was born on February 29.) 

22 Let E be a fixed but arbitrary sample space. Let A and B be nonempty subsets 
of E. Prove that A and B cannot be both independent and disjoint. 

23 The four alternate die numberings illustrated in Fig. 1.3.4 were discovered by 
Stanford statistician Bradley Efron. Note that when dice A and B are thrown 
together, die A beats (rolls a higher number than) die B with probability 2/3. 
Compute the probability that 

(a) die B beats die C. 

(b) die C beats die D. 

(c) die D beats die A. 

Figure 1.3.4. Efron dice. 



24 One variation on the notion of a random walk takes place in the first quadrant 
of the xy-plane. Starting from the origin, the direction of each "step" is 
determined by the flip of a coin. If the kth coin flip is "heads", the kth step is 
one unit in the positive x-direction; if the coin flip is "tails", the step is one 
unit in the positive y-direction. 

(a) Show that, after n steps, a random walker arrives at a point P_n = (r, n - r), 
where n ≥ r ≥ 0. (Hint: Exercise 15, Section 1.2.) 

(b) Assuming the coin is fair, compute the probability that the point 
P_8 = (4, 4). 

(c) Assuming the coin is fair, compute the probability that P_7 lies on the line 
y = x. 

(d) Assuming the coin is fair, compute the probability that P_2k lies on the line 
y = x. 

(e) Let r and n be fixed integers, n ≥ r ≥ 0. Assuming the coin comes up 
heads with probability p and tails with probability q = 1 - p, compute the 
probability that, after n steps, a random walker arrives at the point 
P_n = (r, n - r). 

25 Imagine having been bitten by an exotic, poisonous snake. Suppose the ER 
physician estimates that the probability you will die is I unless you receive 
effective treatment immediately. At the moment, she can offer you a choice of 
experimental antivenins from two competing "snake farms." Antivenin X has 
been administered to ten previous victims of the same type of snake bite and 
nine of them survived. Antivenin Y, on the other hand, has only been 
administered to four previous patients, but all of them survived. Unfortunately, 
mixing the two drugs in your body would create a toxic substance much 
deadlier than the venom from the snake. Under these circumstances, which 
antivenin would you choose, and why? 

26 In California's SuperLotto Plus drawing of February 16, 2002, three winners 
shared a record $193 million jackpot. SuperLotto Plus players choose five 
numbers, ranging from 1 through 47 Plus a "Mega" number between 1 and 27 
(inclusive). The winning numbers in the drawing of February 16 were 6, 11, 
31, 32, and 39 Plus 20. (Order matters only to the extent that the Mega number 
is separate from the other five numbers.) 

(a) Compute the probability of winning a share of the jackpot (with a single 
ticket). 

(b) The jackpot is not the only prize awarded in the SuperLotto game. In 
the February 16 drawing, 56 tickets won $27,859 (each) by matching all 
five (ordinary) numbers but missing the Mega number. Compute the 
probability of correctly guessing all five (ordinary) numbers. 

(c) Compute the probability of correctly guessing all five (ordinary) numbers 
and missing the Mega number. 

(d) In the February 16 drawing, 496 ticket holders won $1572 (each) by 
correctly guessing the Mega number and four out of the other five. 
Compute the probability of winning this prize (with a single ticket). 

1.4. ERROR-CORRECTING CODES 

The key to the connection between the combinatorial and algebraic definitions of 
C(n, r) = \binom{n}{r} involves n-letter words constructed from two-letter alphabets. A 
binary code is a vocabulary comprised of such words. Binary codes have a wide 
variety of applications ranging from stunning interplanetary images to everyday 
digital recordings. A common theme in these applications is the reliable movement 
of data through unreliable communication channels. The general problem is to 
detect and correct transmission errors that might arise from something as mundane 
as scratches on a CD to something as exotic as solar flares during an interplanetary 
voyage. 

Our primary focus will be on words assembled using the alphabet F = {0, 1}, 
the letters of which are typically called bits. 

1.4.1 Definition. An n-bit word is also known as a binary word of length n. The 
set of all 2^n binary words of length n will be denoted F^n. A binary code of length n 
is a nonempty subset of F^n. 

A "good" code is one that can be used to transmit lots of information down a 
noisy channel, quickly and reliably. Consider, e.g., the code 𝒞 = {00000, 11111} ⊂ 
F^5, where 00000 might represent "yes" and 11111 might mean "no." Suppose one 
of these two codewords is sent down a noisy channel, only to have 000_0, or worse, 
00010 come out the other end. While it is a binary word of length 5, 00010 is not a 
codeword. Thus, we detect an error. Just to make things interesting, suppose no 
further communication is possible. (Maybe the original message consisted of a sin- 
gle prerecorded burst.) Assuming it is more likely for any particular bit to be trans- 
mitted correctly than not, 00000 is more likely to have been the transmitted 
message than 11111. Thus, we might correct 00010 to 00000. Note that a binary 
word "corrected" in this way need not be correct in the sense that it was the trans- 
mitted codeword. It is just the legitimate codeword most likely to be correct. 

1.4.2 Definition. Suppose b and w are binary words of length n. The distance 
between them, d(b, w), is the number of places in which they differ. 

Nearest-neighbor decoding refers to a process by which an erroneous binary 
word w is corrected to a legitimate codeword c in a way that minimizes d(w, c). 
With the code 𝒞 = {00000, 11111}, it is possible to detect as many as four errors. 
With nearest-neighbor decoding, it is possible (correctly) to correct as many as two; 
𝒞 is a two-error-correcting code. (If 00000 were sent and 10101 received, 
nearest-neighbor decoding would produce 11111, the wrong message. Code 𝒞 is not 
three-error-correcting.) 
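
Definition 1.4.2 and nearest-neighbor decoding are easily expressed in code. The following Python sketch is an illustration, not the author's method; binary words are represented as strings of 0's and 1's, the function names are choices made here, and ties are broken by taking the lexicographically first codeword (an arbitrary rule).

```python
def distance(b, w):
    """d(b, w): the number of places in which binary words b and w differ."""
    if len(b) != len(w):
        raise ValueError("words must have the same length")
    return sum(x != y for x, y in zip(b, w))


def nearest_neighbor(word, code):
    """Return a codeword c that minimizes d(word, c).

    Ties go to the lexicographically first codeword, an arbitrary choice.
    """
    return min(sorted(code), key=lambda c: distance(word, c))


code = {"00000", "11111"}
print(nearest_neighbor("00010", code))   # one error: decoded as 00000
print(nearest_neighbor("10101", code))   # three errors: decoded as 11111
```

The second call reproduces the failure described above: three errors push the received word closer to the wrong codeword.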

1.4.3 Definition. An r-error-correcting code is one for which nearest-neighbor 
decoding reliably corrects as many as r errors. 

Using the code 𝒞 = {100, 101}, suppose 100 is sent. If 110 is received, an error 
is detected. Because d(110, 100) = 1 < 2 = d(110, 101), nearest-neighbor decoding 
corrects 110 to 100, the correct message. But, this is not enough to make 𝒞 
a one-error-correcting code. If 100 is sent and a single transmission error occurs, 
in the third bit, so that 101 is received, the error will not even be detected, much 
less corrected. An r-error-correcting code must reliably correct r erroneous bits, no 
matter which r bits they happen to be. 

Calling d a "distance" doesn't make it one. To be a distance, d(b, w) should be 
zero whenever b = w, positive whenever b ≠ w, symmetric in the sense that 
d(b, w) = d(w, b) for all b and w, and it should satisfy the 
shortest-distance-between-two-points rule, also known as the triangle inequality. Of these conditions, 
only the last one is not obviously valid. 

1.4.4 Lemma (Triangle Inequality). If u, v, and w are fixed but arbitrary 
binary words of length n, then 

d(u, w) ≤ d(u, v) + d(v, w). 

Proof. The words u and w cannot differ from each other in a place where neither 
of them differs from v. Being binary words, they also cannot differ from each other 
in a place where both of them differ from v. It follows that d(u, w) is the sum of the 
number of places where u differs from v but w does not, and the number of places 
where w differs from v but u does not. Because the first term in this sum is at most 
d(u, v), the number of places where u differs from v, and the second is at most 
d(w, v), the number of places where w differs from v, d(u, w) ≤ d(u, v) + d(w, v). ■ 

1.4.5 Definition. An (n,M, d) code consists of M binary words of length n, the 
minimum distance between any pair of which is d. 

1.4.6 Example. The code {00000, 11111} is evidently a (5, 2, 5) code. While 
it is easy to see that n = 5 and M = 4 for the code 𝒞 = {00000, 11101, 10011, 
01110}, the value of d is less obvious. Computing the distances d(00000, 
11101) = 4, d(00000, 10011) = 3, d(00000, 01110) = 3, d(11101, 10011) = 3, 
d(11101, 01110) = 3, and d(10011, 01110) = 4, between all C(4, 2) = 6 pairs of 
codewords, yields the minimum d = 3. So, 𝒞 is a (5, 4, 3) code. □ 
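
The pairwise comparison carried out in Example 1.4.6 is tedious by hand but routine for a machine. The sketch below, offered as an illustration with function names chosen here, computes the parameters (n, M, d) of a code given as a collection of bit strings. It assumes the code contains at least two words, all of the same length.

```python
from itertools import combinations


def distance(b, w):
    """Number of places in which equal-length binary words b and w differ."""
    return sum(x != y for x, y in zip(b, w))


def parameters(code):
    """Return (n, M, d) for a code containing at least two codewords."""
    words = sorted(set(code))
    n = len(words[0])
    if any(len(w) != n for w in words):
        raise ValueError("all codewords must have the same length")
    if len(words) < 2:
        raise ValueError("minimum distance needs at least two codewords")
    # Examine all C(M, 2) pairs of distinct codewords.
    d = min(distance(u, v) for u, v in combinations(words, 2))
    return n, len(words), d


print(parameters({"00000", "11111"}))                    # (5, 2, 5)
print(parameters({"00000", "11101", "10011", "01110"}))  # (5, 4, 3)
```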

An (n, M, d) code 𝒞 can reliably detect as many as d - 1 errors. To determine 
how many errors 𝒞 can reliably correct, consider the possibility that, for some 
erroneous binary word w, there is a tie for the codeword nearest w. Maybe d(c, w) ≥ r 
for every c ∈ 𝒞, with equality for c_1 and c_2. In practice, such ties are broken by 
some predetermined rule. Because it can happen that this arbitrary rule dictates 
decoding w as c_1, even when c_2 was the transmitted codeword, no such code can 
reliably correct as many as r errors. However, by the triangle inequality, 
d(c_1, w) = d(w, c_2) = r implies that d(c_1, c_2) ≤ 2r, guaranteeing that no such 
situation can occur when 2r < d. It seems we have proved the following. 

1.4.7 Theorem. An (n, M, d) code is r-error-correcting if and only if 
2r + 1 ≤ d. 

Recall that our informal notion of a good code is one that can transmit lots of 
information down a noisy channel, quickly and reliably. So far, our discussion has 
focused on reliability. Let's talk about speed. For the sake of rapid transmission, one 
would like to have short words (small n) and a large vocabulary (big M). Because 
M ≤ 2^n, these are conflicting requirements. 

Suppose we fix n and d and ask how large M can be. The following notion is 
useful in addressing this question. 

1.4.8 Definition. Let w be a binary word of length n. The sphere of radius r 
centered at w is 

S_r(w) = {b ∈ F^n : d(w, b) ≤ r}, 

the set of binary words that differ from w in at most r bits. 

Because it is a sphere together with its interior, "ball" might be a more 
appropriate name for S_r(w). 

1.4.9 Example. Let 𝒞 be a (10, M, 7) code and suppose c ∈ 𝒞. Because there 
are 10 places in which a binary word can differ from c, there must be 10 binary 
words that differ from c in just 1 place. Similarly, C(10, 2) = 45 words differ 
from c in exactly 2 places and C(10, 3) = 120 words differ from it in 3 places. 
Evidently, including c itself, S_3(c) contains a total of 

1 + 10 + 45 + 120 = 176 

binary words only one of which, namely, c, is a codeword. 

If c_1 and c_2 are different codewords, then S_3(c_1) ∩ S_3(c_2) ≠ ∅ only if there is a 
binary word w such that d(w, c_1) ≤ 3 and d(w, c_2) ≤ 3, implying that 

d(c_1, c_2) ≤ d(c_1, w) + d(w, c_2) 
≤ 6 

and contradicting our assumption that the minimum distance between codewords 
is 7. In other words, if c_1 ≠ c_2, then S_3(c_1) ∩ S_3(c_2) = ∅. 

One might think of S_3(c) as a sphere of influence for c. Because different spheres 
of influence are disjoint and since each sphere contains 176 of the 1024 binary 
words of length 10, there is insufficient room in F^10 for as many as six spheres 
of influence. (Check it: 6 × 176 = 1056.) Evidently, the vocabulary of a 
three-error-correcting binary code of length 10 can consist of no more than five words! 
If 𝒞 is a (10, M, 7) code, then M ≤ 5. □ 

Example 1.4.9 has the following natural generalization. 

1.4.10 Theorem (Sphere-Packing Bound). The vocabulary of an r-error-correcting 
code of length n contains no more than 2^n/N(n, r) codewords, where 

N(n, r) = C(n, 0) + C(n, 1) + ... + C(n, r). 

Proof. Suppose 𝒞 = {c_1, c_2, ..., c_M} ⊂ F^n is an r-error-correcting code. Let 
S_r(c_i) be the sphere of influence centered at codeword c_i, 1 ≤ i ≤ M. Since spheres 
corresponding to different codewords are disjoint and o(S_r(c_i)) = N(n, r), 
1 ≤ i ≤ M, the number of different binary words of length n contained in the union 
of the M spheres is M × N(n, r), a number that cannot exceed the total number of 
binary words of length n. ■ 
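
For readers who wish to experiment, the bound of Theorem 1.4.10 can be evaluated with a short Python sketch. The function names below are assumptions made for illustration; the bound is rounded down because M is an integer.

```python
from math import comb


def sphere_size(n, r):
    """N(n, r) = C(n, 0) + C(n, 1) + ... + C(n, r), the number of words in S_r(w)."""
    return sum(comb(n, k) for k in range(r + 1))


def sphere_packing_bound(n, r):
    """Largest integer M satisfying M * N(n, r) <= 2^n."""
    return 2**n // sphere_size(n, r)


print(sphere_size(10, 3))            # 176, as in Example 1.4.9
print(sphere_packing_bound(10, 3))   # 5
```

The two printed values agree with Example 1.4.9: each sphere of radius 3 in F^10 holds 176 words, so at most five such spheres can be disjoint.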

1.4.11 Example. Suppose you were asked to design a three-error-correcting 
code capable of sending the four messages NORTH, EAST, WEST, or SOUTH. 
Among the easiest solutions is the (16, 4, 8) code 

{0000000000000000, 1111111100000000, 1111000011110000, 1111000000001111}. 

However, if speed (or professional pride) is an issue, you might want to hold this 
one in reserve and look for something better. 

For a solution to be optimal, it should (at the very least) be an (n, 4, 7) code with n 
as small as possible. According to Example 1.4.9, a three-error-correcting code of 
length 10 can have at most five codewords, which would be ample for our needs. 
Moreover, because 4 × N(9, 3) = 4 × (1 + 9 + 36 + 84) = 520 > 2^9, there can be 
no (9, 4, 7) codes. So, the best we can hope to achieve is a (10, 4, 7) code. 

Without loss of generality, we can choose c_1 = 0000000000. (Why?) Since it 
must differ from c_1 in (no fewer than) 7 places, we may as well let 
c_2 = 1111111000. To differ from c_1 in 7 places, c_3 must contain 7 (or more) 1's. 
But, c_3 can differ from c_2 in 7 places only if (at least) four of its first seven bits are 
0's! It is, of course, asking too much of a 10-bit word that it contain at least four 0's 
and at least seven 1's. The same problem arises no matter which seven bits are set 
equal to 1 in c_2, and setting more than seven bits equal to 1 only makes matters 
worse! It seems there do not exist even three binary words of length 10 each 
differing from the other two in (at least) seven bits. (Evidently, the sphere-packing bound 
is not always attainable!) 

If there are no (10, 3, 7) codes, there certainly cannot be any (10, 4, 7) codes. 
What about an (11, 4, 7) code? This time, the obvious choices, c_1 = 00000000000 
and c_2 = 11111110000, leave room for c_3 = 00001111111, which differs from c_2 
in eight places and from c_1 in seven. Because c_4 = 11110001111 differs from c_2 
and c_3 in seven places and from c_1 in eight, 𝒞 = {c_1, c_2, c_3, c_4} is an (11, 4, 7) code. 

□ 
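
The conclusions of Example 1.4.11 are small enough to check by exhaustive search. The following Python sketch is an illustration rather than the method of the text; words are stored as integers, the function names are chosen here, and the search looks for words that are pairwise at distance at least d. Fixing the first word at 00...0 loses no generality, because changing the same bit positions in every codeword leaves all distances unchanged.

```python
from itertools import combinations


def distance(u, v):
    """Number of bit positions in which the integers u and v differ."""
    return bin(u ^ v).count("1")


def find_code(n, size, d):
    """Search for `size` words of length n, pairwise at distance >= d.

    Returns a list of bit strings, or None when no such words exist.
    """
    def extend(chosen, start):
        if len(chosen) == size:
            return chosen
        # Only words larger than the last one chosen need be tried.
        for w in range(start, 2**n):
            if all(distance(w, c) >= d for c in chosen):
                found = extend(chosen + [w], w + 1)
                if found is not None:
                    return found
        return None

    words = extend([0], 1)
    if words is None:
        return None
    return [format(w, "0{}b".format(n)) for w in words]


def min_distance(code):
    """Minimum distance of a code given as a list of bit strings."""
    return min(distance(int(u, 2), int(v, 2)) for u, v in combinations(code, 2))


print(find_code(10, 3, 7))   # None: no three words of length 10 are this far apart
print(find_code(11, 4, 7))   # some four words of length 11 that are

book_code = ["00000000000", "11111110000", "00001111111", "11110001111"]
print(min_distance(book_code))   # 7
```

The code found by the search need not be the one constructed in the example; any four suitable words will do.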

Our discovery, in Example 1.4.11, that M ≤ 2 in any (10, M, 7) code is a little 
surprising. Because a sphere of radius 3 in F^10 holds (only) 176 words, two 
nonoverlapping spheres contain little more than a third of the 1024 words in F^10! On the 
other hand, how many solid Euclidean balls of radius 3 will fit inside a Euclidean 
cube of volume 1024?* 

*Even in the familiar world of three-dimensional Euclidean space, sphere-packing problems can be highly 
nontrivial. On the other hand, in at least one sense, packing Euclidean spheres in three-space is a bad 
analogy. Orange growers are interested in sphere packing because, without damaging the produce, they 
want to minimize the fraction of empty space in each "full" box of oranges. Apart from degenerate cases, 
equality is never achievable in the grower's version of the sphere-packing bound. 

Figure 1.4.1. Three-dimensional binary space. 

1.4.12 Example. As illustrated in Fig. 1.4.1, three-dimensional binary space F^3 
is comparable, not to a Euclidean cube, but to the set consisting of its eight vertices! 
While packing the Euclidean cube with Euclidean spheres always results in 
"leftover" Euclidean points, F^3 is easily seen* to be the disjoint union of the spheres 
S_1(000) = {000, 100, 010, 001} and S_1(111) = {111, 011, 101, 110}. (Note the 
two different ways in which S_1(111) is "complementary" to S_1(000).) □ 

1.4.13 Definition. An (n, M, d) code is perfect if 2^n = M × [C(n, 0) + 
C(n, 1) + ... + C(n, r)], where r = ⌊(d - 1)/2⌋ is the greatest integer not 
exceeding (d - 1)/2. 

So, an r-error-correcting code 𝒞 is perfect if and only if its vocabulary achieves 
the sphere-packing bound, if and only if F^n is the disjoint union of the spheres S_r(c) 
as c ranges over 𝒞, if and only if every binary word of length n belongs to the 
sphere of influence of some (unique) codeword. In particular, a perfect code is as 
efficient as it is possible for codes to be. 

It follows from Definition 1.4.13 that F^n, itself, is perfect. It is the disjoint union 
of the (degenerate) spheres S_0(b), b ∈ F^n. Such trivial examples are uninteresting 
for a number of reasons, not the least of which is that F^n cannot detect, much 
less correct, even a single error. A nontrivial perfect code emerges from 
Example 1.4.12, namely, the one-error-correcting (3, 2, 3) code {000, 111}. Might 
this be the only nontrivial example? No, {100, 011} is another. All right, might the 
only nontrivial examples have parameters (3, 2, 3)? 

1.4.14 Lemma. Suppose 𝒞 is an (n, M, d) code for which r = ⌊(d - 1)/2⌋ = 1. 
Then 𝒞 is perfect if and only if there exists an integer m ≥ 2 such that n = 2^m - 1 
and M = 2^(n-m). 

Proof. If 𝒞 is perfect, then 2^n = M × N(n, 1) = M(1 + n), so that M = 2^n/ 
(1 + n). Now, 1 + n exactly divides 2^n only if 1 + n = 2^m for some positive integer 
m ≤ n, in which case M = 2^n/2^m = 2^(n-m). Moreover, 2^m - 1 = n ≥ d ≥ 3 implies 
m ≥ 2. 

Conversely, if n = 2^m - 1 and M = 2^(n-m), then M(1 + n) = 2^(n-m) × 2^m = 2^n. ■ 

*Because one vertex is hidden from view, "seen" may not be the most appropriate word to use here. 

1.4.15 Example. The parameters of the perfect (3, 2, 3) code 𝒞 = {000, 111} 
satisfy the conditions of Lemma 1.4.14 when m = 2. 

Setting d = 3 and m = 3 in Lemma 1.4.14 shows that every (7, 16, 3) code is 
perfect. What it does not show is the existence of even one (7, 16, 3) code! 
However, as the reader may confirm, (7, 16, 3) is the triple of parameters for the so-called 
Hamming code ℋ_3 = {0000000, 1000011, 0100101, 0010110, 0001111, 1100110, 
1010101, 1001100, 0110011, 0101010, 0011001, 0111100, 1011010, 1101001, 
1110000, 1111111}. In Chapter 6, the existence of an (n, M, 3) code that satisfies 
the conditions of Lemma 1.4.14 will be established for every m ≥ 4. □ 
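
The confirmation left to the reader in Example 1.4.15 can be delegated to a few lines of Python. This sketch is illustrative only; the variable and function names are choices made here, and the codewords are copied from the example.

```python
from collections import Counter
from itertools import combinations
from math import comb

HAMMING = """
0000000 1000011 0100101 0010110 0001111 1100110 1010101 1001100
0110011 0101010 0011001 0111100 1011010 1101001 1110000 1111111
""".split()


def distance(b, w):
    """Number of places in which equal-length binary words b and w differ."""
    return sum(x != y for x, y in zip(b, w))


n = len(HAMMING[0])
M = len(set(HAMMING))
d = min(distance(u, v) for u, v in combinations(HAMMING, 2))
r = (d - 1) // 2

# Definition 1.4.13: perfect means 2^n = M * [C(n, 0) + ... + C(n, r)].
perfect = 2**n == M * sum(comb(n, k) for k in range(r + 1))

# Number of codewords having each possible count of 1's.
weights = Counter(word.count("1") for word in HAMMING)

print((n, M, d), perfect)
print(sorted(weights.items()))
```

The run reports the parameters (7, 16, 3) and confirms that 16 spheres of radius 1, each containing 8 words, exactly fill the 128 words of F^7.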

1.4. EXERCISES 

1 What is the largest possible value for M in any (8,M, 1) code? 

2 How many errors can an (n,M, 8) code 
(a) detect? (b) correct? 

3 Find the parameters (n,M,d) for the binary code 

(a) 𝒞_1 = {000, 011, 101, 110}. 

(b) 𝒞_2 = {000, 011, 101, 110, 111, 100, 010, 001}. 

(c) 𝒞_3 = {0000, 0110, 1010, 1100, 1111, 1001, 0101, 0011}. 

(d) 𝒞_4 = {11000, 00011, 00101, 00110, 01001, 01010, 01100, 10001, 10010, 
10100}. (Compare 𝒞_4 with the POSTNET barcodes of Fig. 1.1.3.) 

4 Construct a code (or explain why none exists) with parameters 
(a) (3, 4, 2). (b) (6, 4, 4). (c) (12, 4, 8). 

(d) (4, 7, 2). (e) (8, 7, 4). (f) (8, 8, 4). 

5 The American Standard Code for Information Interchange (ASCII) is a scheme 
for assigning numerical values from 0 through 255 to selected symbols. For 
example, the uppercase letters of the English alphabet correspond to 65 through 
90, respectively. Why 256 symbols? Good question. The answer involves bits 
and bytes. Consisting of two four-bit "zones", a byte can store any binary 
numeral in the range 0 through 255. 

Apart from representing binary numerals, bytes can also be viewed as 
codewords in 𝒞 = F^8. Because it corresponds to the base-2 numeral for 65, 
the codeword/byte 01000001 represents A (in the ASCII scheme). Similarly, Z, 
corresponding to 90, is represented by the codeword/byte 01011010. 

(a) What is the ASCII number for the letter S? 

(b) What byte represents S? 

(c) What letter corresponds to ASCII number 76? 

(d) What letter is represented by codeword/byte 01010101? 

(e) The ASCII number for the square-root symbol is 251. What codeword/byte 
represents √? 

(f) Decode the message 01001101-01000001-01010100-01001000. 

6 The complement of a binary word b is the word b* obtained from b by 
changing all of its zeros to ones and all of its ones to zeros. For any binary code 
𝒞, define 𝒞* = {c* : c ∈ 𝒞}. 

(a) Show that 𝒞_2 = 𝒞_1 ∪ 𝒞_1*, where 𝒞_1 and 𝒞_2 are the codes in Exercises 3(a) 
and (b), respectively. 

(b) Find a code 𝒞 of length 3 satisfying 𝒞* = F^3 \ 𝒞, the set-theoretic 
complement of 𝒞. (Hint: Example 1.4.12.) 

(c) Find a code 𝒞 of length 3 satisfying 𝒞* = 𝒞. 

(d) If 𝒞 is an (n, M, d) code, prove or disprove that 𝒞* has the same parameters. 

(e) If 𝒞 is an (n, M, d) code, prove or disprove that F^n \ 𝒞 has the same parameters. 

7 The weight of a binary word b, wt(b), is the number of bits of b equal to 1. A 
constant-weight code is one in which every codeword has the same weight. 

(a) Show that d ≥ 2 in any constant-weight (n, M, d) code (in which n > 1). 

(b) Find a constant-weight (8, M, d) code with d > 2. 

(c) Find the largest possible value for M in a constant-weight (8, M, d) code. 

8 Let 𝒞 be the (8, 56, 2) code consisting of all binary words of length 8 and 
weight 5. (See Exercise 7.) Let 𝒞* be the code consisting of the complements 
of the codewords of 𝒞. (See Exercise 6.) Prove that 𝒞 ∪ 𝒞* is an (8, 112, 2) 
code. 

9 M. Plotkin* proved that if n < 2d in the (n, M, d) code 𝒞, then M ≤ 
2⌊d/(2d - n)⌋, where ⌊x⌋ is the greatest integer not larger than x. Does the 
Plotkin bound preclude the existence of 

(a) (12, 52, 5) codes? (b) (12, 7, 7) codes? 

(c) (13, 13, 7) codes? (d) (15, 2048, 3) codes? 

(Justify your answers.) 

10 Does the sphere-packing bound (Theorem 1.4.10) rule out the existence of a 
(a) (12, 52, 5) code? (b) (12, 7, 7) code? 
(c) (13, 13, 7) code? (d) (15, 2048, 3) code? 
(Justify your answers.) 



*Binary codes with specified minimum distances, IEEE Trans. Info. Theory 6 (1960), 445-450. 

11 The purpose of this exercise is to prove the Plotkin bound from Exercise 9. Let 
𝒞 = {c_1, c_2, ..., c_M} be an (n, M, d) code where n < 2d. Define 

D = Σ_{i,j=1}^{M} d(c_i, c_j). 

(a) Prove that D ≥ M(M - 1)d. 

(b) Let A be the M × n (0, 1)-matrix whose ith row consists of the bits of 
codeword c_i. If the kth column of A contains z_k 0's (and M - z_k 1's), prove 
that 

D = 2 Σ_{k=1}^{n} z_k(M - z_k). 

(c) If M is even, show that f(z) = z(M - z) is maximized when z = M/2. 

(d) Prove the Plotkin bound in the case that M is even. 

(e) If M is odd, show that D ≤ n(M^2 - 1)/2. 

(f) Prove the Plotkin bound in the case that M is odd. 

(g) Where is the hypothesis n < 2d used in the proof? 

12 The parity of binary word b is 0 if wt(b) is even and 1 if wt(b) is odd. (See 
Exercise 7.) If b = xy...z is a binary word of length n and parity p, denote by 
$b^+ = xy \ldots zp$ the binary word of length n + 1 obtained from b by appending a 
new bit equal to its parity. For any binary code $\mathcal{C}$ of length n, let 

\[ \mathcal{C}^+ = \{c^+ : c \in \mathcal{C}\}. \]

(a) Show that $\mathcal{C}_3 = \mathcal{C}_2^+$, where $\mathcal{C}_2$ and $\mathcal{C}_3$ are the codes from Exercises 3(b) 
and (c), respectively. 

(b) If $\mathcal{C}$ is an (n, M, d) code, where d is odd, prove that $\mathcal{C}^+$ is an 
(n + 1, M, d + 1) code. 

(c) Prove that exactly half the words in $F^n$ have parity p = 0. 

(d) Prove or disprove that if $\mathcal{C}$ is a fixed but arbitrary binary code of length n, 
then exactly half the words in $\mathcal{C}$ have even weight. 

13 Let M(n, d) be the largest possible value of M in any (n, M, d) code. Prove that 
M(n, 2r - 1) = M(n + 1, 2r). 

14 If $\mathcal{C}$ is a code of length n, its "weight enumerator" is the two-variable 
polynomial defined by 

\[ W_{\mathcal{C}}(x, y) = \sum_{c \in \mathcal{C}} x^{\mathrm{wt}(c)}\, y^{\,n - \mathrm{wt}(c)}, \]

where wt(c) is the weight of c defined in Exercise 7. 

(a) Compute $W_{\mathcal{C}}(x, y)$ for each of the codes in Exercise 3. 

(b) Show that $W_{\mathcal{C}}(x, y) = x^7 + 7x^4y^3 + 7x^3y^4 + y^7$ for the perfect Hamming 
code $\mathcal{C} = \mathcal{H}_3$ of Example 1.4.15. 

(c) Two codes are equivalent if one can be obtained from the other by 
uniformly permuting (rearranging) the order of the bits in each codeword. 
Show that equivalent codes have the same parameters. 

(d) Show that equivalent codes have the same weight enumerator. 

(e) Exhibit two inequivalent codes with the same weight enumerator. 

15 Exhibit the parameters for the perfect Hamming code $\mathcal{H}_4$ (corresponding to 
m = 4 in Lemma 1.4.14). 

16 Show that the Plotkin bound (Exercise 9) is strong enough to preclude the 
existence of a (10, 3, 7) code (see Example 1.4.11). 

17 Can the (11, 4, 7) code in Example 1.4.11 be extended to an (11, 5, 7) code? 

18 Let u, v, and w be binary words of length n. Show that d(u, w) = d(u, v) + 
d(v, w) - 2b, where b is the number of places in which u and w both differ from v. 

19 Following up on the discussion between Examples 1.4.11 and 1.4.12, show 
that two solid Euclidean spheres of radius 3 cannot be fit inside a cubical box 
of volume 1024 in such a way that both spheres touch the bottom of the box. 

20 Show that the necessary condition for the existence of an r-error-correcting 
code given by the sphere-packing bound is not sufficient. 

21 Let M(n,d) be the largest possible value of M in any (n,M,d) code. 

(a) If $n \ge 2$, prove that $M(n, d) \le 2M(n-1, d)$. 

(b) Prove that $M(2d, d) \le 4d$. 

22 Show that a necessary condition for equality to hold in the Plotkin bound 
(Exercises 9 and 11) is $d(c_i, c_j) = d$, $i \ne j$. 

23 The (7, 16, 3) code $\mathcal{H}_3$ in Example 1.4.15 is advertised as a perfect code. 
While it is easy to check that $\mathcal{H}_3$ is a binary code of length 7 containing 16 
codewords, (given what we know now) it might take a minute or two to confirm 
that the minimum distance between any two codewords is 3. Assuming that 
has been done, how hard is it to confirm that $\mathcal{H}_3$ is a perfect code? (Justify 
your answer by providing the confirmation.) 

24 Let $A = F^3 \setminus S_1(110)$, the (set-theoretic) complement of $S_1(110)$ in $F^3$. 

(a) Show that A is a sphere in F 3 . 

(b) Do A and Si (110) exhibit both kinds of complementarity discussed in 
Exercise 6? 

25 Prove that every (23, 4096, 7) code is perfect. 

26 Construct a code with parameters (8, 16, 4). 

27 Construct a code with parameters 
(a) (6, 8, 3). (b) (7, 8, 4). 

28 The purpose of this exercise is to justify nearest-neighbor decoding. We begin 
with some assumptions about the transmission channel. The simplest case is a 
so-called symmetric channel in which the probability of a 1 being changed to 0 
is the same as that of a 0 being changed to 1. If we assume this common error 
probability, call it p, is the same for each bit of every word, then q = 1 - p is 
the probability that any particular bit is transmitted correctly. 

(a) Show that the probability of transmitting codeword c and receiving binary 
word w along such a channel is $p^r q^{n-r}$, where r is the number of places in 
which c and w differ. 

(b) Under the assumption that $p < \tfrac{1}{2}$ (engineers work very hard to ensure that 
p is much less than $\tfrac{1}{2}$), show that the probability in part (a) is maximized 
when r is as small as possible. 

29 Suppose the two-error-correcting code $\mathcal{C} = \{00000, 11111\}$ is used in a 
symmetric channel for which the probability of a transmission error in each 
bit is p = 0.05. (See Exercise 28.) 

(a) Show that the probability of more than two errors in the transmission of a 
single codeword is less than 0.0012. 

(b) There may be cases in which a probability of failure as high as 0.0012 is 
unacceptable. What is the probability of more than three errors in the 
transmission of a single codeword using the same channel and the code 
{0000000, 1111111}? 



1.5. COMBINATORIAL IDENTITIES 

Poetry is the art of giving different names to the same thing. 

— Anonymous 

As we saw in Section 1.2, $C(n, r) = \binom{n}{r}$ is the same as the multinomial coefficient 
$\binom{n}{r,\, n-r}$. In fact, C(n, r) is commonly called a binomial coefficient.* Given that 
binomial coefficients are special cases of multinomial coefficients, it is natural to 
wonder whether we still need a separate name and notation for n-choose-r. On the 
other hand, it turns out that multinomial coefficients can be expressed as products of 
binomial coefficients. Thus, one could just as well argue for discarding the 
multinomial coefficients! 

1.5.1 Theorem. If $r_1 + r_2 + \cdots + r_k = n$, then 

\[ \binom{n}{r_1, r_2, \ldots, r_k} = \binom{n}{r_1} \binom{n - r_1}{r_2} \binom{n - r_1 - r_2}{r_3} \cdots \binom{n - r_1 - r_2 - \cdots - r_{k-1}}{r_k}. \]

*This name is thought to have been coined by Michael Stifel (ca. 1485-1567), among the most celebrated 
algebraists of the sixteenth century. Also known for numerological prophecy, Stifel predicted publicly that 
the world would end on October 3, 1533. 

Proof. The multinomial coefficient $\binom{n}{r_1, r_2, \ldots, r_k}$ is the number of n-letter "words" that 
can be assembled using $r_1$ copies of one "letter", say $A_1$; $r_2$ copies of a second, $A_2$; 
and so on, finally using $r_k$ copies of some kth character, $A_k$. The theorem is proved 
by counting these words another way and setting the two (different-looking) 
answers equal to each other. 

Think of the process of writing one of the words as a sequence of k decisions. 
Decision 1 is which of n spaces to fill with $A_1$'s. Because this amounts to selecting 
$r_1$ of the n available positions, it involves $C(n, r_1)$ choices. Decision 2 is which of 
the remaining $n - r_1$ spaces to fill with $A_2$'s. Since there are $r_2$ of these characters, 
the second decision can be made in any one of $C(n - r_1, r_2)$ ways. Once the $A_1$'s 
and $A_2$'s have been placed, there are $n - r_1 - r_2$ positions remaining to be filled, 
and $A_3$'s can be assigned to $r_3$ of them in $C(n - r_1 - r_2, r_3)$ ways, and so on. By 
the fundamental counting principle, the number of ways to make this sequence of 
decisions is the product 

\[ C(n, r_1) \times C(n - r_1, r_2) \times C(n - r_1 - r_2, r_3) \times \cdots \times C(n - r_1 - r_2 - \cdots - r_{k-1}, r_k). \]

(Because $r_1 + r_2 + \cdots + r_k = n$, the last factor in this product is $C(r_k, r_k) = 1$.) 

■ 
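Theorem 1.5.1 is easy to spot-check numerically. The following is a minimal sketch, not part of the original text; the function names `multinomial` and `product_of_binomials` are illustrative choices, and the sample parts (3, 1, 2, 4) are arbitrary.

```python
from math import comb, factorial

def multinomial(parts):
    """Multinomial coefficient n! / (r_1! r_2! ... r_k!), where n = sum(parts)."""
    result = factorial(sum(parts))
    for r in parts:
        result //= factorial(r)  # each partial quotient is still an integer
    return result

def product_of_binomials(parts):
    """Right-hand side of Theorem 1.5.1: C(n, r_1) C(n - r_1, r_2) ..."""
    remaining = sum(parts)  # positions not yet filled
    result = 1
    for r in parts:
        result *= comb(remaining, r)  # choose r of the remaining positions
        remaining -= r
    return result

parts = [3, 1, 2, 4]
print(multinomial(parts), product_of_binomials(parts))  # 12600 12600
```

The loop in `product_of_binomials` mirrors the sequence of decisions in the proof: each factor counts the ways to place one letter in the positions still open.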

It turns out that both binomial and multinomial coefficients have their unique 
qualities and uses. Keeping both is vastly more convenient than eliminating 
either. 

Let's do some mathemagic. Pick a number, any number, just so long as it is an 
entry from Pascal's triangle. Suppose your pick happened to be 15 = C(6, 2). Starting 
with C(2, 2), the first nonzero entry in column 2 (the third column of Fig. 1.5.1), 



C(0,0)
C(1,0)   C(1,1)
C(2,0)   C(2,1)   C(2,2)
                    +
C(3,0)   C(3,1)   C(3,2)   C(3,3)
                    +
C(4,0)   C(4,1)   C(4,2)   C(4,3)   C(4,4)
                    +
C(5,0)   C(5,1)   C(5,2)   C(5,3)   C(5,4)
                    +
C(6,0)   C(6,1)   C(6,2)   C(6,3)   C(6,4)
C(7,0)   C(7,1)   C(7,2)   C(7,3)   C(7,4)

Figure 1.5.1 

add the entries down to and including C(6, 2). The sum will be C(7, 3). Check it 
out: 

C(2, 2) + C(3, 2) + C(4, 2) + C(5, 2) + C(6, 2) = 1 + 3 + 6 + 10 + 15 

= 35 

= C(7,3). 

The trick is an easy consequence of Pascal's relation and the fact that 
C(2, 2) = C(3, 3). (See if you can reason it out before reading on.) 

1.5.2 Chu's Theorem.* If $n \ge r$, then 

\[ \sum_{k=0}^{n} C(k, r) = C(r, r) + C(r+1, r) + C(r+2, r) + \cdots + C(n, r) = C(n+1, r+1) \]

(where $\sum_{k=0}^{n} C(k, r) = \sum_{k=r}^{n} C(k, r)$ because C(k, r) = 0, k < r). 

Proof. Replace C(r, r) with C(r+1, r+1) and use Pascal's relation repeatedly to 
obtain 

C(r+1, r+1) + C(r+1, r) = C(r+2, r+1), 

C(r+2, r+1) + C(r+2, r) = C(r+3, r+1), 

and so on, ending with 

C(n, r+1) + C(n, r) = C(n+1, r+1). ■ 
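For readers who like to test an identity before trusting it, here is a small sketch (not from the original text) that checks Chu's theorem for a range of n and r. The helper name `chu_left` is hypothetical.

```python
from math import comb

def chu_left(n, r):
    """Left-hand side of Chu's theorem: C(r, r) + C(r+1, r) + ... + C(n, r)."""
    return sum(comb(k, r) for k in range(r, n + 1))

# Compare against C(n+1, r+1) for every pair with r <= n in a small range.
for r in range(6):
    for n in range(r, 12):
        assert chu_left(n, r) == comb(n + 1, r + 1)

print(chu_left(6, 2))  # 35, the "mathemagic" example: 1 + 3 + 6 + 10 + 15
```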

Chu's theorem has many interesting applications. To set the stage for one of 
them, we interrupt the mathematical discussion to relate a story about the young 
Carl Friedrich Gauss.† At the age of seven, Gauss entered St. Katharine's 
Volksschule in the duchy of Brunswick. One day his teacher, J. G. Büttner, assigned 
Gauss's class the problem of computing the sum 

1+2+ ••• + 100. 

*Rediscovered many times, Theorem 1.5.2 can be found in Chu Shih-Chieh, Precious Mirror of the Four 
Elements, 1303. 

†Gauss (1777-1855) is one of the half-dozen greatest mathematicians of the last millennium. 

While his fellow pupils went right to work computing sums, Gauss merely stared at 
his slate and, after a few minutes, wrote 

\[ \frac{100 \times 101}{2} = 5050. \]

He seems to have reasoned that numbers can be added forwards or backwards, 

1+2+ 3 + ••• + 98 + 99+ 100, 
100 + 99 + 98 + •••+ 3+ 2+ 1, 

or even sidewards. Adding sidewards gives 1 + 100= 101, 2 + 99 = 101, 
3 + 98= 101, and so on. With each of the hundred columns adding to 101, 
the sum of the numbers in both rows, twice the total we're looking for, is 
100 x 101. 

Gauss's method can just as well be used to sum the first n positive integers: 

\[ 1 + 2 + \cdots + n = \frac{n(n+1)}{2} = C(n+1, 2). \tag{1.6} \]

Seeing the answer expressed as a binomial coefficient may seem a little contrived, 
but, with its left-hand side rewritten as $C(1,1) + C(2,1) + \cdots + C(n,1)$, 
Equation (1.6) is seen to be the r = 1 case of Chu's theorem! 

There is a formula comparable to Equation (1.6) for the sum of the squares of 
the first n positive integers, namely, 

\[ 1^2 + 2^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}. \tag{1.7} \]

Once one has seen it (or guessed it), Equation (1.7) is easy enough to prove by 
induction. But, where did the formula come from in the first place? Chu's theorem! 
Summing both sides of 

\[ k^2 = k + k(k-1) = C(k,1) + 2C(k,2), \tag{1.8} \]

we obtain 

\[ \sum_{k=1}^{n} k^2 = \sum_{k=1}^{n} C(k,1) + 2\sum_{k=1}^{n} C(k,2). \]

Two applications of Chu's theorem (one with r = 1 and the other with r = 2) yield 

\[ \begin{aligned}
1^2 + 2^2 + \cdots + n^2 &= C(n+1, 2) + 2C(n+1, 3) \\
&= \frac{(n+1)n}{2} + \frac{(n+1)n(n-1)}{3} \\
&= \frac{n(n+1)}{6}\,[3 + 2(n-1)] \\
&= \frac{n(n+1)(2n+1)}{6},
\end{aligned} \]

precisely Equation (1.7). 

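What about summing mth powers? If we just had an analog of Equation (1.8), 
i.e., an identity of the form 

\[ k^m = \sum_{r=1}^{m} a_{r,m}\, C(k, r) \tag{1.9} \]

(where $a_{r,m}$ is independent of k, $1 \le r \le m$), we could sum both sides and use Chu's 
theorem to obtain 

\[ \begin{aligned}
\sum_{k=1}^{n} k^m &= \sum_{k=1}^{n} \sum_{r=1}^{m} a_{r,m}\, C(k, r) \\
&= \sum_{r=1}^{m} a_{r,m} \sum_{k=1}^{n} C(k, r) \\
&= \sum_{r=1}^{m} a_{r,m}\, C(n+1, r+1).
\end{aligned} \tag{1.10} \]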



To see what's involved when m = 3, consider the equation 

\[ k^3 = x\,C(k,1) + y\,C(k,2) + z\,C(k,3) = xk + \tfrac{1}{2}\, y\, k(k-1) + \tfrac{1}{6}\, z\, k(k-1)(k-2), \]

which is equivalent to 

\[ 6k^3 = (6x - 3y + 2z)k + (3y - 3z)k^2 + zk^3. \]

(Check it.) Equating coefficients of like powers of the integer variable k yields the 
system of linear equations 

6x - 3y + 2z = 0, 
3y - 3z = 0, 
z = 6, 

which has the unique solution y = z = 6 and x = 1. (Confirm this too.) 
Therefore, 

\[ k^3 = C(k,1) + 6C(k,2) + 6C(k,3) \tag{1.11} \]

or, in the language of Equation (1.9), $a_{1,3} = x = 1$, $a_{2,3} = y = 6$, and $a_{3,3} = z = 6$. 
Together, Equations (1.9)-(1.11) yield 

\[ 1^3 + 2^3 + \cdots + n^3 = C(n+1, 2) + 6C(n+1, 3) + 6C(n+1, 4) = \frac{n^2 (n+1)^2}{4}. \]

(Confirm these computations.) 
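One way to confirm these computations is by brute force. The sketch below is an illustration added here, not part of the original text, and its function names are hypothetical. It checks Equation (1.11) and the resulting closed form for the sum of cubes over a range of values.

```python
from math import comb

def cube_via_binomials(k):
    """Right-hand side of Equation (1.11)."""
    return comb(k, 1) + 6 * comb(k, 2) + 6 * comb(k, 3)

def sum_of_cubes(n):
    """1^3 + ... + n^3 obtained by applying Chu's theorem to each term of (1.11)."""
    return comb(n + 1, 2) + 6 * comb(n + 1, 3) + 6 * comb(n + 1, 4)

for k in range(1, 50):
    assert cube_via_binomials(k) == k ** 3

for n in range(1, 50):
    direct = sum(k ** 3 for k in range(1, n + 1))
    assert sum_of_cubes(n) == direct == n ** 2 * (n + 1) ** 2 // 4
```

A finite check like this is not a proof, but it catches arithmetic slips quickly.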

Now we know where formulas for sums of powers of positive integers come 
from. They are consequences of Chu's theorem as manifested in Equations (1.9)- 
(1.10). From a theoretical point of view, that is all very well. The disagreeable part 
is the prospect of having to solve a system of m equations in m unknowns in order to 
identify the mystery coefficients $a_{r,m}$. In fact, there is an elegant solution to this 
difficulty! 

In the form 

\[ k^m = \sum_{r=1}^{m} C(k, r)\, a_{r,m}, \]

Equation (1.9) is reminiscent of matrix multiplication. To illustrate this perspective, 
let m = 6 and consider that portion of Pascal's triangle lying in rows and columns 
numbered 1-6, i.e., 

1 

2 1 

3 3 1 

4 6 4 1 

5 10 10 5 1 

6 15 20 15 6 1 

Filling in the zeros corresponding to C(n, r), $n < r \le 6$, we obtain the matrix 

\[ C_6 = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
2 & 1 & 0 & 0 & 0 & 0 \\
3 & 3 & 1 & 0 & 0 & 0 \\
4 & 6 & 4 & 1 & 0 & 0 \\
5 & 10 & 10 & 5 & 1 & 0 \\
6 & 15 & 20 & 15 & 6 & 1
\end{pmatrix}. \]

Anyone familiar with determinants will see that this matrix has an inverse. It is 
one of the most remarkable properties of binomial coefficients that $C_n^{-1}$ can be 
obtained from $C_n$ just by sprinkling in some minus signs, e.g., 

\[ C_6^{-1} = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
-2 & 1 & 0 & 0 & 0 & 0 \\
3 & -3 & 1 & 0 & 0 & 0 \\
-4 & 6 & -4 & 1 & 0 & 0 \\
5 & -10 & 10 & -5 & 1 & 0 \\
-6 & 15 & -20 & 15 & -6 & 1
\end{pmatrix}. \]

(Before reading on, confirm that the product of these two matrices is the identity 
matrix, $I_6$.) 

1.5.3 Definition. Let $C_n$ be the n x n Pascal matrix whose (i, j)-entry is binomial 
coefficient C(i, j), $1 \le i, j \le n$. 

1.5.4 Alternating-Sign Theorem. The Pascal matrix $C_n$ is invertible; the (i, j)-entry 
of $C_n^{-1}$ is $(-1)^{i+j} C(i, j)$. 

While it may seem a little like eating the dessert before the broccoli, let's defer 
the proof of the alternating-sign theorem to the end of the section and go directly to 
the application. 
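Before relying on the alternating-sign theorem, it can be reassuring to check it by machine for small n. The following sketch is an added illustration rather than the author's method. It builds the Pascal matrix and its claimed inverse as plain lists of lists and multiplies them; all helper names are hypothetical.

```python
from math import comb

def pascal_matrix(n):
    """C_n: the (i, j)-entry is C(i, j), for 1 <= i, j <= n."""
    return [[comb(i, j) for j in range(1, n + 1)] for i in range(1, n + 1)]

def signed_pascal_matrix(n):
    """Candidate inverse from Theorem 1.5.4: (i, j)-entry is (-1)^(i+j) C(i, j)."""
    return [[(-1) ** (i + j) * comb(i, j) for j in range(1, n + 1)]
            for i in range(1, n + 1)]

def mat_mul(A, B):
    """Product of two matrices stored as lists of rows (exact integer arithmetic)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

for n in range(1, 9):
    C, B = pascal_matrix(n), signed_pascal_matrix(n)
    assert mat_mul(C, B) == identity(n)
    assert mat_mul(B, C) == identity(n)
```

Because every entry is an integer, there is no rounding error to worry about, which is one reason to avoid floating-point matrix inversion here.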

1.5.5 Theorem. If m and r are positive integers, the coefficient of C(k, r) in the 
equation $k^m = \sum_{r=1}^{m} a_{r,m}\, C(k, r)$ is given by 

\[ a_{r,m} = \sum_{t=1}^{m} (-1)^{r+t}\, C(r, t)\, t^m. \]

This more-or-less explicit formula for $a_{r,m}$ eliminates the need to solve a system 
of equations. Put another way, Theorem 1.5.5 solves the corresponding system of m 
equations in m unknowns, once and for all, for every m. 

Proof of Theorem 1.5.5. Suppose $n \ge m, r$. Let $A_n = (a_{i,j})$ be the n x n matrix of 
mystery coefficients (where $a_{r,m} = 0$ whenever r > m). Then, by Equation (1.9), the 
(k, m)-entry of $C_n A_n$ is 

\[ \sum_{r=1}^{m} C(k, r)\, a_{r,m} = k^m, \]

$1 \le k, m \le n$. In other words, $C_n A_n = P_n$, where $P_n$ is the n x n matrix whose (i, j)-entry 
is $i^j$. Thus, $A_n = C_n^{-1} P_n$, so the mystery coefficient $a_{r,m}$ is the (r, m)-entry of 
the matrix product $C_n^{-1} P_n$. ■ 

1.5.6 Example. Let's reconfirm Equation (1.11). By Theorem 1.5.5, 

\[ \begin{aligned}
a_{1,3} &= (-1)^{1+1} C(1,1)\, 1^3 = 1, \\
a_{2,3} &= (-1)^{2+1} C(2,1)\, 1^3 + (-1)^{2+2} C(2,2)\, 2^3 = -2 + 8 = 6, \\
a_{3,3} &= (-1)^{3+1} C(3,1)\, 1^3 + (-1)^{3+2} C(3,2)\, 2^3 + (-1)^{3+3} C(3,3)\, 3^3 = 3 - 24 + 27 = 6;
\end{aligned} \]

i.e., with m = 3, Equation (1.9) becomes $k^3 = C(k,1) + 6C(k,2) + 6C(k,3)$. □ 
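The formula of Theorem 1.5.5 translates almost word for word into code. This is a minimal sketch added for illustration; the function name `a` simply echoes the notation $a_{r,m}$ and is otherwise an arbitrary choice.

```python
from math import comb

def a(r, m):
    """Coefficient a_{r,m} of C(k, r) in the expansion of k^m (Theorem 1.5.5)."""
    # Terms with t > r vanish automatically because comb(r, t) is 0 there.
    return sum((-1) ** (r + t) * comb(r, t) * t ** m for t in range(1, m + 1))

print([a(r, 3) for r in range(1, 4)])  # [1, 6, 6], matching Example 1.5.6

# Sanity check of Equation (1.9) itself for several exponents m.
for m in range(1, 7):
    for k in range(1, 20):
        assert k ** m == sum(a(r, m) * comb(k, r) for r in range(1, m + 1))
```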

In fact, it isn't necessary to compute $a_{r,m}$ for one value of r at a time, or even for 
one value of m at a time! Using matrices, we can calculate the numbers $a_{r,m}$, 
$1 \le r \le m$, $1 \le m \le n$, all at once! 



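1.5.7 Example. When n = 4, 

\[ P_4 = \begin{pmatrix}
1^1 & 1^2 & 1^3 & 1^4 \\
2^1 & 2^2 & 2^3 & 2^4 \\
3^1 & 3^2 & 3^3 & 3^4 \\
4^1 & 4^2 & 4^3 & 4^4
\end{pmatrix}
= \begin{pmatrix}
1 & 1 & 1 & 1 \\
2 & 4 & 8 & 16 \\
3 & 9 & 27 & 81 \\
4 & 16 & 64 & 256
\end{pmatrix}. \]

So, 

\[ C_4^{-1} P_4 = \begin{pmatrix}
1 & 0 & 0 & 0 \\
-2 & 1 & 0 & 0 \\
3 & -3 & 1 & 0 \\
-4 & 6 & -4 & 1
\end{pmatrix}
\begin{pmatrix}
1 & 1 & 1 & 1 \\
2 & 4 & 8 & 16 \\
3 & 9 & 27 & 81 \\
4 & 16 & 64 & 256
\end{pmatrix}
= \begin{pmatrix}
1 & 1 & 1 & 1 \\
0 & 2 & 6 & 14 \\
0 & 0 & 6 & 36 \\
0 & 0 & 0 & 24
\end{pmatrix}
= A_4. \]

(Check the substitutions and confirm the matrix multiplication.) Observe that 
column 3 of $A_4$ recaptures Equation (1.11), column 2 reconfirms Equation (1.8), 
and column 1 reflects the fact that $k^1 = k = C(k, 1)$. Column 4 is new: 

\[ k^4 = C(k,1) + 14C(k,2) + 36C(k,3) + 24C(k,4). \tag{1.12} \]
□ 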
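The matrix computation of Example 1.5.7 can be carried out for any n in a few lines. The sketch below is an added illustration under the assumption that the alternating-sign theorem supplies the inverse; the names `coefficient_matrix` and `power_sum` are hypothetical. It forms $A_n = C_n^{-1} P_n$ and then uses Equation (1.10) to sum mth powers.

```python
from math import comb

def coefficient_matrix(n):
    """A_n = C_n^{-1} P_n, with C_n^{-1} taken from the alternating-sign theorem."""
    C_inv = [[(-1) ** (i + j) * comb(i, j) for j in range(1, n + 1)]
             for i in range(1, n + 1)]
    P = [[i ** j for j in range(1, n + 1)] for i in range(1, n + 1)]
    return [[sum(C_inv[r][t] * P[t][m] for t in range(n)) for m in range(n)]
            for r in range(n)]

def power_sum(n, m):
    """1^m + 2^m + ... + n^m computed via Equation (1.10)."""
    A = coefficient_matrix(m)  # an m x m matrix already contains column m
    return sum(A[r - 1][m - 1] * comb(n + 1, r + 1) for r in range(1, m + 1))

for row in coefficient_matrix(4):
    print(row)
# [1, 1, 1, 1]
# [0, 2, 6, 14]
# [0, 0, 6, 36]
# [0, 0, 0, 24]
```

For example, `power_sum(10, 4)` agrees with adding up the ten fourth powers directly.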



So much for the dessert. It's time for the broccoli. 



Proof of the Alternating-Sign Theorem. Given an n x n matrix $C = (c_{ij})$, recall 
that the n x n matrix $B = (b_{ij})$ is its inverse if and only if $CB = I_n$ if and only if 
$BC = I_n$. Let $C = C_n$ be the n x n Pascal matrix, so that $c_{ij} = C(i, j)$. In the context 
of Theorem 1.5.4, we have a candidate for $C^{-1}$, namely, the matrix B whose (i, j)-entry 
is $b_{ij} = (-1)^{i+j} C(i, j)$. With these choices, $CB = I_n$ if and only if 

\[ \sum_{k=1}^{n} C(i, k)\,(-1)^{k+j}\, C(k, j) = \delta_{ij}, \tag{1.13a} \]

$1 \le i, j \le n$, and $BC = I_n$ if and only if 

\[ \sum_{k=1}^{n} (-1)^{i+k}\, C(i, k)\, C(k, j) = \delta_{ij}, \tag{1.13b} \]

$1 \le i, j \le n$, where 

\[ \delta_{ij} = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{otherwise} \end{cases} \]

is the so-called Kronecker delta. 

Let's prove Equation (1.13a). Because C(i, k) = 0, k > i, and C(k, j) = 0, k < j, 

\[ \sum_{k=1}^{n} C(i, k)\,(-1)^{k+j}\, C(k, j) = \sum_{k=j}^{i} (-1)^{k+j}\, C(i, k)\, C(k, j). \]

If j > i, the right-hand sum is empty, meaning that the left-hand sum is zero. (So 
far, so good.) If $i \ge k \ge j$, then (confirm it) $C(i, k)\,C(k, j) = C(i, j)\,C(i - j, k - j)$. 
Substituting this identity into the right-hand sum yields 

\[ \begin{aligned}
\sum_{k=j}^{i} (-1)^{k+j}\, C(i, j)\, C(i - j, k - j) &= C(i, j) \sum_{k=j}^{i} (-1)^{j+k}\, C(i - j, k - j) \\
&= C(i, j) \sum_{r=0}^{i-j} (-1)^{r}\, C(i - j, r),
\end{aligned} \]

where r = k - j. If i = j, this expression contains just one term, namely, 
$C(i, i) \times (-1)^0 C(0, 0) = 1$. So, to complete the proof of Theorem 1.5.4, it remains to 
establish the following. ■ 

1.5.8 Lemma. If n > 0, then $\sum_{r=0}^{n} (-1)^r C(n, r) = 0$. 

1.5.9 Example. With n = 5, Lemma 1.5.8 becomes 

C(5, 0) - C(5, 1) + C(5, 2) - C(5, 3) + C(5, 4) - C(5, 5) = 0, 

which is an immediate consequence of symmetry: C(5, 2) = C(5, 3), C(5, 1) = 
C(5, 4), and C(5, 0) = C(5, 5). If n = 4, the identity 

C(4, 0) - C(4, 1) + C(4, 2) - C(4, 3) + C(4, 4) = 1 - 4 + 6 - 4 + 1 = 0, 

while just as valid, is a little less obvious. □ 

Proof of Lemma 1.5.8. The lemma follows from the binomial theorem, which will 
be taken up in Section 1.7. It is easy enough, however, to give a direct proof. 
Observe that the conclusion is equivalent to 

\[ \sum_{r \text{ even}} C(n, r) = \sum_{r \text{ odd}} C(n, r), \]

i.e., the number of subsets of T = {1, 2, ..., n} having even cardinality is equal to 
the number of subsets of T with odd cardinality. 

Temporarily denote the family of all $2^n$ subsets of T by $\mathcal{F}$. We will prove the 
result by exhibiting a one-to-one, onto function* $f : \mathcal{F} \to \mathcal{F}$ such that $A \in \mathcal{F}$ has 
an even (odd) number of elements if and only if f(A) has an odd (even) number. If 
n = o(T) is odd, the function defined by $f(A) = T \setminus A = \{x \in T : x \notin A\}$, the 
complement of A, meets our needs. (This is the easy case, illustrated for n = 5 in 
Example 1.5.9.) If n is even, the function defined by 

\[ f(A) = \begin{cases} A \cup \{n\} & \text{when } n \notin A, \\ A \setminus \{n\} & \text{when } n \in A \end{cases} \]

satisfies our requirements. 
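The second function in the proof is simple enough to test exhaustively for small n. The sketch below is an added illustration with hypothetical helper names. It lists every subset of {1, ..., n}, applies the map that toggles membership of n, and checks that the map is a bijection that always changes the parity of the cardinality.

```python
from itertools import combinations

def toggle(A, n):
    """The map f for the even case: add n if it is absent, remove it if present."""
    return A - {n} if n in A else A | {n}

def subsets(n):
    """All subsets of {1, ..., n}, as frozensets."""
    items = range(1, n + 1)
    return [frozenset(c) for size in range(n + 1)
            for c in combinations(items, size)]

n = 4
family = subsets(n)
images = [toggle(A, n) for A in family]

# Onto (and therefore one-to-one, since the family is finite).
assert set(images) == set(family) and len(images) == len(family)
# Even-sized sets go to odd-sized sets and vice versa.
assert all(len(A) % 2 != len(toggle(A, n)) % 2 for A in family)

even = sum(1 for A in family if len(A) % 2 == 0)
print(even, len(family) - even)  # 8 8
```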

1.5.10 Example. Some values of the function 

\[ f(A) = \begin{cases} A \cup \{4\} & \text{when } 4 \notin A, \\ A \setminus \{4\} & \text{when } 4 \in A \end{cases} \]

(corresponding to n = 4) are given in Fig. 1.5.2. □ 

A              f(A)
∅              {4}
{1}            {1,4}
{2}            {2,4}
{3,4}          {3}
{1,3,4}        {1,3}
{1,2,3,4}      {1,2,3}

Figure 1.5.2 



1.5. EXERCISES 

1 Prove that 

(a) The sum 2 + 4 + 6 + ... + 2n of the first n even integers is n(n + 1). 

(b) The sum 1 + 3 + 5 + ... + (2n - 1) of the first n odd integers is $n^2$. 



* One-to-one, onto functions are also known as bijections. 

2 Evaluate 

(a) $\sum_{i=1}^{n} i(i-1)$. (b) $\sum_{i=1}^{n} i(i+1)$. 

(c) $\sum_{i=1}^{n} (2i-1)$. (d) $\sum_{i=1}^{n} i(i-1)(i-2)$. 

3 A sequence of numbers $a_1, a_2, \ldots$ is arithmetic if there is a fixed constant c 
such that $a_{i+1} - a_i = c$ for all $i \ge 1$. For such a sequence, show that 

(a) $a_{n+1} = a_1 + nc$. (b) $\sum_{i=1}^{n} a_i = \tfrac{1}{2}\, n(a_1 + a_n)$. 

4 The proof of Theorem 1.5.1 given in the text is the combinatorial proof. Sketch 
the algebraic proof, i.e., write each of the binomial coefficients in terms of 
factorials and do lots of cancelling to obtain the multinomial coefficient. 

5 Show that 

(a) $C(r_k, r_k) \times C(r_{k-1} + r_k, r_{k-1}) \times \cdots \times C(r_1 + r_2 + \cdots + r_k, r_1) = \binom{n}{r_1, r_2, \ldots, r_k}$. 

» (;) + Ct') + (" 2 )— (r)^( r+ : +1 > 

6 Use mathematical induction to prove that $1^3 + 2^3 + \cdots + n^3 = \tfrac{1}{4}\, n^2 (n+1)^2$. 

7 Confirm (by a brute-force computation) that 

\[ k^4 = C(k,1) + 14C(k,2) + 36C(k,3) + 24C(k,4). \]

8 Prove that $1^4 + 2^4 + \cdots + n^4 = \tfrac{1}{30}\, n(n+1)(2n+1)(3n^2 + 3n - 1)$ 

(a) using Equations (1.9)-(1.10) and (1.12). 

(b) using mathematical induction. 

9 Solve for the coefficients $a_{r,5}$, $1 \le r \le 5$, in the equation $k^5 = \sum_{r=1}^{5} a_{r,5}\, C(k, r)$ 

(a) using the matrix equation $A_5 = C_5^{-1} P_5$. 

(b) by solving a system of five equations in five unknowns without using the 
matrix equation. 

10 What is the formula for the sum of the fifth powers of the first n positive 
integers? (Hint: Lots of computations afford lots of opportunities to make 
mistakes. Confirm your formula for three or four values of n.) 

11 Suppose f and g are functions of the positive integer variable n. If 
$f(n) = \sum_{r=1}^{n} C(n, r)\, g(r)$ for all $n \ge 1$, prove that $g(n) = \sum_{r=1}^{n} (-1)^{n+r} C(n, r)\, f(r)$ for 
all $n \ge 1$. 

12 If $m \ge n$, prove that 

(a) $\sum_{r=1}^{n} C(m, r)\, C(n-1, r-1) = C(m+n-1, n)$. 

(b) $\sum_{r=1}^{n} r\, C(m, r)\, C(n, r) = n\, C(m+n-1, n)$. 

13 Prove that $1 \times 2 + 2 \times 3 + 3 \times 4 + \cdots + n \times (n+1) = \tfrac{1}{3}\, n(n+1)(n+2)$. 

14 Prove that $1 \times 2 \times 3 + 2 \times 3 \times 4 + \cdots + n(n+1)(n+2) = \tfrac{1}{4}\, n(n+1)(n+2)(n+3)$. 

15 Prove Vandermonde's identity*: If m and n are positive integers, then 

\[ C(m, 0)C(n, r) + C(m, 1)C(n, r-1) + \cdots + C(m, r)C(n, 0) = C(m+n, r). \]

16 Prove that $\sum_{r=0}^{n} C(n, r)^2 = C(2n, n)$. (Compare with Exercise 11, Section 1.2.) 

17 How many of the C(52, 5) different five-card poker hands contain 
(a) a full house? (b) four of a kind? 

18 How many of the C(52, 13) different 13-card bridge hands contain 
(a) all four aces? (b) a 4-3-3-3 suit distribution? 

19 Show that 

(a) E" r = 1 (-iy- l [C(n,r-l)/r} = l/(n+l). 

(b) $\sum_{r=0}^{n} (-1)^r\, [C(n, r)/(r+1)] = 1/(n+1)$. 

(c) $\sum_{r=1}^{n} (-1)^{r-1}\, [C(n, r)/r] = \sum_{k=1}^{n} 1/k$. 

(d) $C_m^{-1} v^t = w^t$, where v = (1/2, 1/3, ..., 1/[m+1]) and w = (1/2, -2/3, 
3/4, -4/5, ..., $[(-1)^{m+1} m/(m+1)]$). 

(e) $C_m w^t = v^t$, where v and w are the vectors from part (d). 

(f) Confirm the m = 6 case of part (e); i.e., write down the 6 x 6 matrix $C_6$ 
and confirm that $C_6 w^t = v^t$. 

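20 Let n be fixed. Denote the rth-power sum of the first n - 1 positive integers by 
$g(r) = 1^r + 2^r + \cdots + (n-1)^r$. Show that 

(a) $g(0) = n - 1$. (b) $g(1) = \tfrac{1}{2} n^2 - \tfrac{1}{2} n$. 

(c) $g(2) = \tfrac{1}{3} n^3 - \tfrac{1}{2} n^2 + \tfrac{1}{6} n$. (d) $g(3) = \tfrac{1}{4} n^4 - \tfrac{1}{2} n^3 + \tfrac{1}{4} n^2$. 

(e) $g(4) = \tfrac{1}{5} n^5 - \tfrac{1}{2} n^4 + \tfrac{1}{3} n^3 - \tfrac{1}{30} n$. 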

21 The rth Bernoulli number, $b_r$, is the coefficient of n in the function g(r) of 
Exercise 20. The first few Bernoulli numbers are exhibited in Fig. 1.5.3. Jakob 
Bernoulli (1654-1705) showed that the remaining coefficients in g(r), $r \ge 1$, 
can be expressed in terms of the $b_r$'s by means of the identity 

\[ g(r) = \sum_{k=0}^{r} \frac{1}{k+1}\, C(r, k)\, b_{r-k}\, n^{k+1}. \]

r       0      1      2      3      4
b_r     1    -1/2    1/6     0    -1/30

Figure 1.5.3. Bernoulli numbers. 

*Named for Alexandre-Théophile Vandermonde (1735-1796), who published the result in 1772 (469 years 
after it appeared in Chu Shih-Chieh's book). 

(a) Use the r = 4 case of this identity, along with Fig. 1.5.3, to recapture the 
expression for g(4) in Exercise 20(e). 

(b) Show that your solution to part (a) is consistent with Exercise 8. 

(c) Compute g(5). 

(d) Show that your solution to part (c) is consistent with your solution to 
Exercise 10. 

22 The Bernoulli numbers (Exercise 21) satisfy the implicit recurrence 
$\sum_{k=0}^{r} C(r+1, k)\, b_k = 0$, $r \ge 1$. Use this relation (and Fig. 1.5.3) to show that 

(a) $b_5 = 0$. (b) $b_6 = \tfrac{1}{42}$. (c) $b_7 = 0$. 

(d) $b_8 = -\tfrac{1}{30}$. (e) $b_9 = 0$. (f) $b_{10} = \tfrac{5}{66}$. 

23 Let n be fixed. Prove that the function $g(r) = 1^r + 2^r + \cdots + (n-1)^r$, from 
Exercise 20, can be expressed in the form $\sum_k c_{r,k}\, n^k$, where the coefficients 
satisfy the recurrence $(k+1)\, c_{r,k+1} = r\, c_{r-1,k}$ for all $r, k \ge 1$. 

24 Use Exercises 20(e) and 23 and the fact that g(r) = 1 when n = 2 to compute 
g(5). 



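25 Let r and s be integers, $0 \le r < s$, and let 

\[ C_{[r,s]} = \begin{pmatrix}
C(r, r) & C(r, r+1) & \cdots & C(r, s) \\
C(r+1, r) & C(r+1, r+1) & \cdots & C(r+1, s) \\
\vdots & \vdots & & \vdots \\
C(s, r) & C(s, r+1) & \cdots & C(s, s)
\end{pmatrix}. \]

(a) Show that $C_{[1,n]} = C_n$. 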

(b) Exhibit C m . 

(c) Show that $C_{[r,s]}$ is an (s - r + 1)-square matrix. 

(d) Show that the (i, j)-entry of $C_{[r,s]}$ is C(r + i - 1, r + j - 1). 

(e) Show that $C_{[r,s]}$ is invertible. 

(f) Exhibit Cp^. 

(g) Prove that the (i, j)-entry of the inverse of $C_{[r,s]}$ is $(-1)^{i+j}\, C(r+i-1, r+j-1)$, 
$1 \le i, j \le s - r + 1$. 

(h) Let t be a nonnegative integer. If f and g are functions that satisfy 
$f(n) = \sum_{k=t}^{n} C(n, k)\, g(k)$ for all $n \ge t$, prove that 
$g(n) = \sum_{k=t}^{n} (-1)^{n+k}\, C(n, k)\, f(k)$ for all $n \ge t$. 

26 The Fibonacci sequence (Exercise 19, Section 1.2) may be defined by 
$F_0 = F_1 = 1$ and $F_{n+1} = F_n + F_{n-1}$, $n \ge 1$. 

(a) Show that $F_4 = F_2 + 2F_1 + F_0$. 

(b) Show that $F_5 = F_3 + 2F_2 + F_1$. 

(c) Show that $F_6 = F_3 + 3F_2 + 3F_1 + F_0$. 

(d) Show that $F_7 = F_4 + 3F_3 + 3F_2 + F_1$. 

(e) Given that $F_{2n+1} = \sum_{r=0}^{n} C(n, r)\, F_{r+1}$, prove that $F_{2n} = \sum_{r=0}^{n} C(n, r)\, F_r$. 

(f) Prove that $F_n = \sum_{r=0}^{n} (-1)^{n+r}\, C(n, r)\, F_{2r}$. (Hint: Use part (e) and the t = 1 
case of Exercise 25(h).) 

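27 If $C = C_{[0,m]}$ is the matrix from Exercise 25, show that CK = L, where 

\[ L = \begin{pmatrix}
C(0,0) & C(1,1) & C(2,2) & C(3,3) & \cdots & C(m,m) \\
C(1,0) & C(2,1) & C(3,2) & C(4,3) & \cdots & C(m+1,m) \\
C(2,0) & C(3,1) & C(4,2) & C(5,3) & \cdots & C(m+2,m) \\
\vdots & \vdots & \vdots & \vdots & & \vdots \\
C(m,0) & C(m+1,1) & C(m+2,2) & C(m+3,3) & \cdots & C(m+m,m)
\end{pmatrix} \]

and 

\[ K = \begin{pmatrix}
C(0,0) & C(1,1) & C(2,2) & C(3,3) & \cdots & C(m,m) \\
0 & C(1,0) & C(2,1) & C(3,2) & \cdots & C(m,m-1) \\
0 & 0 & C(2,0) & C(3,1) & \cdots & C(m,m-2) \\
\vdots & \vdots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & 0 & \cdots & C(m,0)
\end{pmatrix}. \]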



28 For a fixed but arbitrary positive integer m, prove that the coefficients $a_{r,m}$, 
$1 \le r \le m$, in Equation (1.9) exist and are independent of k. (Hint: Show that 
any polynomial $f(x) = b_m x^m + b_{m-1} x^{m-1} + \cdots + b_0$ of degree at most m can 
be expressed (uniquely) as a linear combination of $p_0(x), p_1(x), \ldots, p_m(x)$, 
where $p_0(x) = 1$ and $p_r(x) = (1/r!)\, x(x-1) \cdots (x-r+1)$, $r \ge 1$.) 



1.6. FOUR WAYS TO CHOOSE 

The prologues are over. ... It is time to choose. 

— Wallace Stevens (Asides on the Oboe) 

From its combinatorial definition, n-choose-r is the number of different r-element 
subsets of an n-element set. Because two subsets are equal if and only if they contain 
the same elements, $\binom{n}{r}$ depends on what elements are chosen, not when. In 
computing C(n, r), the order in which elements are chosen is irrelevant. The 
C(5, 2) = 10 two-element subsets of {L, U, C, K, Y} are 

{L, U}, {L, C}, {L, K}, {L, Y}, {U, C}, {U, K}, {U, Y}, {C, K}, {C, Y}, {K, Y}, 

where, e.g., {L, U} = {U, L}. There are, of course, circumstances in which order is 
important. 

1.6.1 Example. Consider all possible "words" that can be produced using two 
letters from the word LUCKY. By the fundamental counting principle, the number 
of such words is 5 x 4, twice C(5, 2), reflecting the fact that order is important. The 
20 possibilities are 

LU, LC, LK, LY, UC, UK, UY, CK, CY, KY, 

UL, CL, KL, YL, CU, KU, YU, KC, YC, YK. □ 

1.6.2 Definition. Denote by P(n, r) the number of ordered selections of r elements 
chosen from an n-element set. 

By the fundamental counting principle, 

\[ \begin{aligned}
P(n, r) &= n(n-1)(n-2) \cdots (n - [r-1]) \\
&= n(n-1)(n-2) \cdots (n-r+1) \\
&= \frac{n!}{(n-r)!} \\
&= r!\, C(n, r).
\end{aligned} \]

There is another way to arrive at this last identity: We may construe P(n, r) as 
the number of ways to make a sequence of just two decisions. Decision 1 is which 
r elements to select, without regard to order, a decision having C(n, r) 
choices. Decision 2 is how to order the r elements once they have been selected, 
and there are r! ways to do that. By the fundamental counting principle, the number 
of ways to make the sequence of two decisions is C(n, r) x r! = P(n, r). 

1.6.3 Example. Suppose nine members of the Alameda County School Boards 
Association meet to select a three-member delegation to represent the association 
at a statewide convention. There are C(9, 3) = 84 different ways to choose the 
delegation from those present. If the bylaws stipulate that each delegation be comprised 
of a delegate, a first alternate, and a second alternate, the nine members can comply 
from among themselves in any one of P(9, 3) = 3!C(9, 3) = 504 ways. □ 

1.6.4 Example. Door prizes are a common feature of fundraising luncheons. 
Suppose each of 100 patrons is given a numbered ticket, while its duplicate is 
placed in a bowl from which prize-winning numbers will be drawn. If the prizes 
are $10, $50, and $150, then (assuming winning tickets are not returned to the 
bowl) a total of P(100, 3) = 970,200 different outcomes are possible. If, on the 
other hand, the three prizes are each $70, then the order in which the numbers 
are drawn is immaterial. In this case, the number of different outcomes is 
C(100, 3) = 161,700. □ 

Both C(n, r) and P(n, r) involve situations in which an object can be chosen at 
most once. We have been choosing without replacement. What about choosing with 
replacement? What if we recycle the objects, putting them back so they can be chosen 
again? How many ways are there to choose r things from n things with replacement? 
The answer depends on whether order matters. If it does, the answer is easy. 
The number of ways to make a sequence of r decisions each of which has n choices 
is $n^r$. 

1.6.5 Example. How many different two-letter "words" can be produced using 
the "alphabet" {L, U, C, K, Y}? If there are no restrictions on the number of times 
a letter can be used, then $5^2 = 25$ such words can be produced; i.e., there are 25 
ways to choose 2 things from 5 with replacement if order matters. In addition to 
the 20 words from Example 1.6.1, there are five new ones, namely, LL, UU, CC, 
KK, and YY. □ 

This brings us to the fourth way to choose. 

1.6.6 Example. In how many ways can r = 10 items be chosen from 
{A, B, C, D, E} with replacement if order doesn't matter? As so often happens in 
combinatorics, the solution is most easily obtained by solving another problem 
that has the same answer. Suppose, e.g., A were chosen three times, B once, C 
twice, D not at all, and E four times. Associate with this selection the 14-letter 
"word" 

|||—|—||——|||| 

In this word, the "letter" | represents a tally mark. Since we are choosing 10 times, 
there are ten |'s. The dashes are used to separate tally marks corresponding to one 
letter from those that correspond to another. The first three |'s are for the three A's. 
The first dash separates the three A tallies from the single tally corresponding to the 
only B; the second dash separates the B tally from the two C tallies. There is no 
tally mark between the third and fourth dashes because there are no D's. Finally, 
the last four |'s represent the four E's. Since {A, B, C, D, E} has n = 5 elements, 
we need 4 dashes to keep their respective tally marks separate. Conversely, any 
14-letter word consisting of ten |'s and four —'s corresponds to a unique selection. 
The word |||||||——|—||—, e.g., corresponds to seven A's, no B's, one C, two D's, and 
no E's. 

Because the correspondence is one-to-one, the number of ways to select r = 10 
things from n = 5 things with replacement where order doesn't matter is equal to 
the number of 14-letter words that can be made up from ten |'s and four —'s, i.e., to 
C(14, 10) = 1001. □ 

                        Order matters     Order doesn't matter
Without replacement     P(n, r)           C(n, r)
With replacement        n^r               C(r + n - 1, r)

Figure 1.6.1. The four ways to choose. 



1.6.7 Theorem. The number of different ways to choose r things from n things 
with replacement if order doesn't matter is C(r + n - 1, r).

Proof. As in Example 1.6.6, there is a one-to-one correspondence between 
selections and [r + (n - 1)]-letter words consisting of r tally marks and n - 1 dashes. 
The number of such words is C(r + n - 1, r). ■
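
The correspondence used in this proof can be checked mechanically. The following Python sketch is only an illustration: the function names selection_to_word and word_to_selection are made up for this purpose, and a hyphen stands in for the dash. It encodes selections from {A, B, C, D, E} as words of tally marks and dashes, decodes them again, and confirms that the case r = 10 produces C(14, 10) distinct words.

```python
from itertools import combinations_with_replacement
from math import comb

def selection_to_word(selection, alphabet):
    """Encode a selection (a string of letters) as tally marks '|' separated by dashes '-'."""
    # One block of tallies per letter of the alphabet, blocks separated by dashes.
    return "-".join("|" * selection.count(letter) for letter in alphabet)

def word_to_selection(word, alphabet):
    """Decode a tally/dash word back into the selection it represents."""
    blocks = word.split("-")
    return "".join(letter * len(block) for letter, block in zip(alphabet, blocks))

alphabet = "ABCDE"
example = "AAABCCEEEE"   # A three times, B once, C twice, D not at all, E four times
word = selection_to_word(example, alphabet)
print(word)              # |||-|-||--||||

# Every selection of r = 10 letters should receive its own 14-letter word.
selections = ["".join(s) for s in combinations_with_replacement(alphabet, 10)]
words = {selection_to_word(s, alphabet) for s in selections}
print(len(selections), len(words), comb(14, 10))   # 1001 1001 1001
```

Because the number of distinct words equals the number of selections, no two selections share a word, which is the one-to-one property the proof relies on.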

1.6.8 Example. Let's return to the door prizes of Example 1.6.4, but, this time, 
suppose that winning tickets are returned to the bowl so they have a chance to be 
drawn again. When the prizes are different, the r = 3 winning tickets are chosen 
from the n = 100 tickets in the bowl with replacement where order matters, and 
$100^3$ = 1 million different outcomes are possible. When the prizes are all the 
same (choosing with replacement when order doesn't matter), the number of differ- 
ent outcomes is only C(3 + 100 - 1,3) = C(102,3) = 171,700. □ 
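
For small cases, all four counts can be confirmed by listing the outcomes. The sketch below is an illustration using Python's standard itertools and math modules; the function names four_ways and four_formulas are invented here. It enumerates the four kinds of selection for n = 5 and r = 2 and then evaluates the formulas for the door-prize numbers n = 100, r = 3.

```python
from itertools import combinations, combinations_with_replacement, permutations, product
from math import comb, perm

def four_ways(n, r):
    """Count the four kinds of selection of r things from n by explicit enumeration."""
    items = range(n)
    return {
        "order matters, without replacement": sum(1 for _ in permutations(items, r)),
        "order doesn't matter, without replacement": sum(1 for _ in combinations(items, r)),
        "order matters, with replacement": sum(1 for _ in product(items, repeat=r)),
        "order doesn't matter, with replacement":
            sum(1 for _ in combinations_with_replacement(items, r)),
    }

def four_formulas(n, r):
    """The same four counts computed from P(n, r), C(n, r), n^r, and C(r + n - 1, r)."""
    return {
        "order matters, without replacement": perm(n, r),
        "order doesn't matter, without replacement": comb(n, r),
        "order matters, with replacement": n ** r,
        "order doesn't matter, with replacement": comb(r + n - 1, r),
    }

print(four_ways(5, 2) == four_formulas(5, 2))   # True
for description, count in four_formulas(100, 3).items():
    print(description, count)
```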

The four ways to choose are summarized in Fig. 1.6.1. Because C(r + n - 1, r) = 
C(r + n - 1, n - 1) ≠ C(r + n - 1, n), it is important to remember 
that in the last column of the table each entry takes the form C(*, r), where r is 
the number of things chosen, replacement or not. (Don't expect this second variable 
always to be labeled r.) 

Choosing with replacement just means that elements may be chosen more than 
once. If order doesn't matter, then the only thing of interest is the multiplicity with 
which each element is chosen. As we saw in Example 1.6.6, C(14, 10) = 1001 
different outcomes are possible when choosing 10 times from {A,B, C,D,E} 
with replacement when order doesn't matter. If, in one of these outcomes, A is cho- 
sen a times, B a total of b times, and so on, then 

a + b + c + d + e= 10. (1.14) 

Evidently, each of the 1001 outcomes gives rise to a different nonnegative integer 
solution to Equation (1.14), and every nonnegative integer solution of this equation 
corresponds to a different outcome. In particular, Equation (1.14) must have 
precisely 1001 nonnegative integer solutions! The obvious generalization is this. 

1.6.9 Corollary. The equation $x_1 + x_2 + \cdots + x_n = r$ has exactly 
C(r + n - 1, r) nonnegative integer solutions.

What about positive integer solutions? That's easy! The number of positive 
integer solutions to Equation (1.14) is equal to the number of nonnegative integer 
solutions to the equation

(a - 1) + (b - 1) + (c - 1) + (d - 1) + (e - 1) = 10 - 5,

namely, to C(5 + 5 - 1, 5) = C(9, 5) = 126. [Of the 1001 nonnegative integer 
solutions to Equation (1.14), at least one variable is zero in all but 126 of them.] 
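
Both counts are small enough to verify by exhaustive search. The following Python sketch (the helper name count_solutions and its minimum parameter are assumptions made for this illustration) tries every candidate 5-tuple and counts those that sum to 10.

```python
from itertools import product
from math import comb

def count_solutions(num_vars, total, minimum=0):
    """Brute-force count of integer solutions of x_1 + ... + x_k = total with every x_i >= minimum."""
    values = range(minimum, total + 1)
    return sum(1 for xs in product(values, repeat=num_vars) if sum(xs) == total)

nonnegative = count_solutions(5, 10)            # solutions of Equation (1.14) with zeros allowed
positive = count_solutions(5, 10, minimum=1)    # solutions with every variable at least 1
print(nonnegative, comb(10 + 5 - 1, 10))        # 1001 1001
print(positive, comb(9, 5))                     # 126 126
```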

1.6.10 Definition. A composition* of n having m parts is a solution, in positive 
integers, to the equation 

$n = x_1 + x_2 + \cdots + x_m$. (1.15)

Notice the change in notation. This is not deliberately meant to be confusing. 
Notation varies with context, and we are now moving on to a new idea. It might 
be useful to think of the integer variables n, r, k, m, etc., as a traveling company 
of players whose roles depend upon the demands of the current drama production. 

A composition expresses n as a sum of parts; 7 = 5 + 2 is a two-part 
composition of 7, not to be confused with 7 = 2 + 5. In the first case, $x_1 = 5$ and $x_2 = 2$; in 
the second, $x_1 = 2$ and $x_2 = 5$. Never mind that addition is commutative. A 
composition is an ordered or labeled solution of Equation (1.15). The six two-part 
compositions of n = 7 are 6 + 1, 5 + 2, 4 + 3, 3 + 4, 2 + 5, and 1 + 6, corresponding, 
e.g., to the six ways to roll a 7 with two dice (one red and one green). 

1.6.11 Theorem. The number of m-part compositions of n is C(n - 1, m - 1).

Proof. The number of positive integer solutions to Equation (1.15) is equal to the 
number of nonnegative integer solutions to

$(x_1 - 1) + (x_2 - 1) + \cdots + (x_m - 1) = n - m$.

By Corollary 1.6.9, this equation has C([n - m] + m - 1, n - m) = C(n - 1, 
n - m) = C(n - 1, m - 1) nonnegative integer solutions. ■

1.6.12 Example. The C(6 - 1, 3 - 1) = C(5, 2) = 10 three-part compositions 
of 6 are illustrated in Fig. 1.6.2. □

1.6.13 Corollary. The (total) number of compositions of n is $2^{n-1}$.



*The term was coined by Major Percy A. MacMahon (1854-1929). Decomposition might be a more 
descriptive word.

Figure 1.6.2



Proof. The number of compositions of n is the sum, as m goes from 1 to n, of the 
number of m-part compositions of n. According to Theorem 1.6.11, that sum is 
equal to

$C(n - 1, 0) + C(n - 1, 1) + \cdots + C(n - 1, n - 1)$,

the sum of the numbers in row n - 1 of Pascal's triangle, namely $2^{n-1}$. ■

By Corollary 1.6.13, there are $2^5 = 32$ different compositions of 6. Ten of them 
are tabulated in Fig. 1.6.2. You will be asked to list the remaining 22 compositions 
in Exercise 11, but why not do it now, while the idea is still fresh? 
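
A listing of this kind is also easy to generate by machine. The recursive generator below is a minimal sketch, not part of the original text (the name compositions is chosen for the illustration): it picks the first part and then composes what is left, and the tallies confirm Theorem 1.6.11 and Corollary 1.6.13 for n = 6.

```python
from collections import Counter
from math import comb

def compositions(n):
    """Yield every composition of n as a tuple of positive integer parts."""
    if n == 0:
        yield ()          # the empty tuple closes off a finished composition
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield (first,) + rest

all_of_6 = list(compositions(6))
by_parts = Counter(len(c) for c in all_of_6)
print(len(all_of_6))                 # 32
print(sorted(by_parts.items()))      # [(1, 1), (2, 5), (3, 10), (4, 10), (5, 5), (6, 1)]
print([c for c in compositions(7) if len(c) == 2])
# [(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)]
```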

1.6.14 Example. How many integer solutions of x + y + z = 20 satisfy x ≥ 1, 
y ≥ 2, and z ≥ 3? Solution: x + y + z = 20 if and only if (x - 1) + (y - 2) + 
(z - 3) = 14. Setting a = x - 1, b = y - 2, and c = z - 3 transforms the problem 
into one involving the number of nonnegative integer solutions of a + b + c = 14. 
By Corollary 1.6.9, the answer is C(14 + 3 - 1, 14) = C(16, 14) = 120. □ 

1.6.15 Example. Some people are suspicious when consecutive integers occur 
among winning lottery numbers. This reaction is probably due to the common mis- 
conception that truly random numbers would be "spread out". Consider a simple 
example. Of the C(6, 3) = 20 three-element subsets of {1,2,3,4,5,6}, how many 
fail to contain at least one pair of consecutive integers? Here is the complete list: 
{1,3,5}, {1,3,6}, {1,4,6}, and {2,4,6}. 

What about the general case? Of the C(n, r) r-element subsets of 
S = {1, 2, ..., n}, how many do not contain even a single pair of consecutive 
integers? Recall the correspondence between r-element subsets of S and n-letter 
"words" consisting of r Y's and n - r N's. In any such word, w, there will be 
some number, $x_0$, of N's that come before the first Y, some number $x_1$ of N's 
between the first and second Y, some number $x_2$ of N's between the second and 
third Y, and so on, with some number $x_r$ of N's coming after the last (rth) Y. Since 
w must contain a total of n - r N's, it must be the case that

$x_0 + x_1 + \cdots + x_r = n - r$.

Every r-element subset of S corresponds to a unique solution of this equation, in 
nonnegative integers, and every nonnegative integer solution of this equation 
corresponds to a unique r-element subset of S. (Confirm that 
C([n - r] + [r + 1] - 1, n - r) = C(n, r).) 

In this correspondence between subsets and words, a subset contains no 
consecutive integers if and only if $x_i > 0$, $1 \le i \le r - 1$. If we substitute $y_0 = x_0$, $y_r = x_r$, 
and $y_i = x_i - 1$, $1 \le i \le r - 1$, then, as in Example 1.6.14, the answer to our 
problem is equal to the number of nonnegative integer solutions of

$y_0 + y_1 + \cdots + y_r = (n - r) - (r - 1) = n - 2r + 1$,

i.e., to

C([n - 2r + 1] + [r + 1] - 1, n - 2r + 1) = C(n - r + 1, n - 2r + 1) = C(n - r + 1, r).

(Be careful, C(n - r + 1, r) ≠ C(r + n - 1, r).)

When n = 6 and r = 3, C(6 - 3 + 1, 3) = C(4, 3) = 4, confirming the result of 
the brute-force list in the first paragraph of this example. □ 
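
The brute-force list can be reproduced, and the general formula spot-checked, with a few lines of Python. This is a sketch for small n only; the function name no_consecutive_subsets is an assumption of the illustration.

```python
from itertools import combinations
from math import comb

def no_consecutive_subsets(n, r):
    """List the r-element subsets of {1, ..., n} that contain no two consecutive integers."""
    return [s for s in combinations(range(1, n + 1), r)
            if all(b - a > 1 for a, b in zip(s, s[1:]))]   # compare neighbouring elements

print(no_consecutive_subsets(6, 3))   # [(1, 3, 5), (1, 3, 6), (1, 4, 6), (2, 4, 6)]

# Compare the enumeration with C(n - r + 1, r) for all small cases.
for n in range(1, 10):
    for r in range(n + 1):
        assert len(no_consecutive_subsets(n, r)) == comb(n - r + 1, r)
```

The loop passes silently, which is consistent with the count C(n - r + 1, r) derived above, although of course it proves nothing beyond the cases tested.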



1.6. EXERCISES 

1 Compute 

(a)P(5,3). (b) C(5,3). (c) C(5,2). 

(d)P(5,2). (e) C(10,4). (f) P(10, 4). 
(g) 7!. 

2 Show that 

(a) nP(n - 1, r) = P(n, r + 1).

(b) P(n + 1, r) = rP(n, r - 1) + P(n, r).

3 In how many ways can four elements be chosen from a seven-element set 

(a) with replacement if order doesn't matter? 

(b) without replacement if order does matter? 

(c) without replacement if order doesn't matter? 

(d) with replacement if order matters?

4 In how many ways can seven elements be chosen from a four-element set 

(a) with replacement if order matters? 

(b) with replacement if order doesn't matter? 

(c) without replacement if order matters? 

(d) without replacement if order doesn't matter? 

5 In how many ways can four elements be chosen from a ten-element set 

(a) with replacement if order matters? 

(b) with replacement if order doesn't matter? 

(c) without replacement if order doesn't matter? 

(d) without replacement if order matters? 

6 In how many ways can seven elements be chosen from a ten-element set 

(a) without replacement if order matters? 

(b) with replacement if order doesn't matter? 

(c) without replacement if order doesn't matter? 

(d) with replacement if order matters? 

7 Show that the multinomial coefficient $\binom{n}{n-r,\, 1,\, \ldots,\, 1} = P(n, r)$.

8 Compute the number of nonnegative integer solutions to 
(a) a + b = 9. (b) a + b + c = 9. 

(c) a + b + c = 30. (d) a + b + c + d = 30. 

9 How many integer solutions of a + b + c + d = 30 satisfy 

(a) d > 3, c > 2, b > 1, a > 0? 

(b) a > 3, b > 2, c > 1, d > 0? 

(c) a > 7, b > 2, c > 5, d > 6? 

(d) a > -3, b > 20, c > 0, d > -2? 

10 Write down all 16 compositions of 5. 

11 Ten of the 32 compositions of 6 appear in Fig. 1.6.2. Write down the 
remaining 22 compositions of 6. 

12 How many compositions of 8 have 
(a) 4 parts? (b) 4 or fewer parts? 
(c) 6 parts? (d) 6 or fewer parts? 

13 Prove that the inequality x + y + z ≤ 14 has a total of 680 nonnegative integer 
solutions.

14 Prove that the inequality $x_1 + x_2 + \cdots + x_m \le n$ has a total of C(n + m, m) 
nonnegative integer solutions. 

15 Starting with $F_0 = F_1 = 1$, the Fibonacci numbers satisfy the recurrence 
$F_n = F_{n-1} + F_{n-2}$, n ≥ 2. Prove that

(a) $F_{k+n} = F_k F_n + F_{k-1} F_{n-1}$, k, n ≥ 1.

(b) $F_{2k+1}$ is a multiple of $F_k$, k ≥ 1.

(c) $F_{3k+2}$ is a multiple of $F_k$, k ≥ 1.

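16 Let $F_n$, n ≥ 0, be the nth Fibonacci number. (See Exercise 15.) Prove that

(a) $\begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}^{n+1} = \begin{pmatrix} F_{n+1} & F_n \\ F_n & F_{n-1} \end{pmatrix}$, n ≥ 1.

(b) $F_{n+1} F_{n-1} = F_n^2 + (-1)^{n+1}$.

(c) $F_n$ and $F_{n+1}$ are relatively prime.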

17 Let n be a positive integer. Prove that there is a composition of n each of whose 
parts is a different Fibonacci number. (See Exercise 15.) 

18 Let $p_n$ be the number of compositions of n each of whose parts is greater 
than 1.

(a) Show that $p_6 = 5$ by writing down the compositions of 6 each of whose 
parts is at least 2.

(b) Show that $p_7 = 8$.

(c) If n ≥ 2, prove that $p_n$ is a Fibonacci number. (Hint: Exercise 19, 
Section 1.2.)

19 Let $l_n$ be the number of compositions of n each of whose parts is at most 2. If 
n ≥ 1, prove that $l_n = F_n$, the nth Fibonacci number. 

20 The first "diagonal" of Pascal's triangle consists entirely of 1's. The second is 
comprised of the numbers 1, 2, 3, 4, 5, ... . The fourth is illustrated in boldface 
in Fig. 1.6.3. Explain the relationship between the kth entry of the nth diagonal 
and choosing, with replacement, from {1, 2, ..., k} where order doesn't 
matter.

21 Suppose five different door prizes are distributed among three patrons, Betty, 
Joan, and Marge. In how many different outcomes does 

(a) Betty get three prizes while Joan and Marge each get one? 

(b) Betty get one prize while Joan and Marge each get two? 

22 Let A be the collection of all 32 compositions of 6. Let B be the 32-element 
family consisting of all subsets of { 1, 2, 3,4, 5}. Because o(A) = o(B), there is 
a one-to-one correspondence between A and B. 

(a) Prove that there are a total of 32! different one-to-one correspondences 
between A and B. 

(b) Of the more than $2.6 \times 10^{35}$ one-to-one correspondences between A and 
B, can any be described by an algorithm, or recipe, that transforms 
compositions into subsets? 

23 What about choosing with limited replacement? Maybe the fundraising 
patrons in Examples 1.6.4 and 1.6.8 should be limited to at most two prizes. 
How many different outcomes are possible, under these terms of limited 
replacement, if there are 100 patrons and 

(a) three different prizes? (b) three equal prizes? 

(c) four different prizes? (d) four equal prizes? 

24 Revisiting the "birthday paradox" (Exercises 20-21, Section 1.3), suppose 
each of k people independently chooses an integer between 1 and m 
(inclusive). Let p be the probability that some two of them choose the same 
number. 

(a) Show that $p = 1 - P(m, k)/m^k$.

(b) M. Sayrafiezadeh showed that $p \approx 1 - [1 - (k/2m)]^{k-1}$ as long as k < m, 
where "≈" means "about equal". Find the error in Sayrafiezadeh's 
estimate when k = 23 and m = 365. 

25 Show that the number of compositions of n having k or fewer parts is 
$N(n - 1, k - 1) = C(n - 1, 0) + C(n - 1, 1) + \cdots + C(n - 1, k - 1)$ (a 
number involved in the sphere-packing bound of Section 1.4). 

26 There is evidence in tomb paintings that ancient Egyptians used astragali 
(ankle bones of animals) to determine moves in simple board games. In later 
Greek and Roman times it was common to gamble on the outcome of throwing 
several astragali at once. When an astragalus is thrown, it can land in one of 
four ways. Compute the number of different outcomes when five astragali are 
thrown simultaneously. 

27 Suppose you have four boxes, labeled A, B, C, and D. How many ways are 
there to distribute 

(a) ten identical marbles among the four boxes? 

(b) the numbers 0-9 among the four boxes?

28 Suppose, to win a share of the grand prize in the weekly lottery, you must 
match five numbers chosen at random from 1 to 49.

(a) Of the C(49, 5) = 1,906,884 five-element subsets of {1, 2, ..., 49}, how 
many contain no consecutive integers? (Hint: Example 1.6.15.)

(b) Show that the probability of at least one pair of consecutive integers 
occurring in the weekly drawing is greater than 1/3.

29 Prove that the (total!) number of subsets of {1, 2, ..., n} that contain no 
two consecutive integers is $F_{n+1}$, the (n + 1)st Fibonacci number. (See 
Exercises 15-19.) 



1.7. THE BINOMIAL AND MULTINOMIAL THEOREMS 

Two roads diverged in a wood, and I — 
I took the one less traveled by, 
And that has made all the difference. 

— Robert Frost (The Road Not Taken) 

Among the most widely known applications of binomial coefficients is the 
following. 

1.7.1 Binomial Theorem. If n is a nonnegative integer, then

$(x + y)^n = \sum_{r=0}^{n} C(n, r)\, x^r y^{n-r}$.

Three applications of distributivity produce the identity 

$(x + y)^2 = (x + y)(x + y) = x(x + y) + y(x + y) = xx + xy + yx + yy$. (1.16)

The familiar next step would be to replace xx with $x^2$, xy + yx with 2xy, and so on, 
but let's freeze the action with Equation (1.16). As it stands, the right-hand side of 
this identity looks as if it could be a sum of two-letter "words". There is an alter- 
native way to think about this word sum. 

Starting with the expression (x + y)(x + y), choose a letter, x or y, from the first 
set of parentheses, and one letter from the second set. Juxtapose the choices, in 
order, so as to produce what looks like a two-letter word. Do this in all possible 
ways, and sum the results. From this perspective, the right-hand side of 
Equation (1.16) is a kind of inventory* of the four ways to make a sequence of two 
decisions. The term yx, e.g., records the sequence in which y is the choice for deci- 
sion 1, namely, which letter to take from the first set of parentheses, and x is the 
choice for decision 2. 

Applied to the expression 

$(x + y)^3 = (x + y)^2 (x + y) = (xx + xy + yx + yy)(x + y)$,

this alternative view of distributivity suggests the following process: Select a 
two-letter word from (xx + xy + yx + yy) and a letter from (x + y). Juxtapose 
these selections (in order), so as to produce a three-letter word. Do this in 
all (4 × 2 = 8) possible ways and sum, obtaining the following analog of 
Equation (1.16):

xxx + xyx + yxx + yyx + xxy + xyy + yxy + yyy. (1.17)

A variation on this alternative view of distributivity would be to picture 
$(x + y)^3 = (x + y)(x + y)(x + y)$ in terms, not of two decisions, but of three. 
Choose one of x or y from the first set of parentheses, one of x or y from the second 
set, and one of x or y from the third. String the three letters together (in order) to 
produce a three-letter word. Doing this in all (2 × 2 × 2 = 8) possible ways and 
summing the results leads to Expression (1.17). However one arrives at that 
expression, replacing words like xyx with monomials like $x^2 y$, and then combining like 
terms, produces the identity

$(x + y)^3 = x^3 + 3x^2 y + 3xy^2 + y^3$. (1.18)

The two variations on our alternative view of distributivity afford two different 
routes to a proof of the binomial theorem. One is inductive: Given the binomial 
expansion of $(x + y)^{n-1}$, the computation of $(x + y)^n$ is viewed in terms of two 
decisions, as in $(x + y)^n = (x + y)^{n-1}(x + y)$, and the proof is completed using 
Pascal's relation. In the second route, the expansion of $(x + y)^n$ is viewed in terms 
of n decisions. 

Proof of Theorem 1.7.1. Taking the route "less traveled by", we evaluate the 
right-hand side of the equation

$(x + y)^n = (x + y)(x + y) \cdots (x + y)$

in a series of steps. Begin by choosing one of x or y from the first set of parentheses, 
one from the second set, and so on, finally choosing one of x or y from the nth set. 
String the n choices together in order. Do this in all possible ways and sum the 
corresponding n-letter words. The resulting analog of Expressions (1.16)-(1.17) is both 
an inventory of the $2^n$ ways to make a sequence of decisions and a vocabulary of all 
possible n-letter words that can be produced using the alphabet {x, y}. From this 
sum of words, the analog of Equation (1.18) is reached in two steps. Viewing x 
and y not as letters in an alphabet but as commuting variables, replace each n-letter 
word with a monomial of the form $x^r y^{n-r}$. Then combine like terms. In the resulting 
two-variable polynomial, the coefficient of $x^r y^{n-r}$ is the number of n-letter words in 
which r of the letters are x's and n - r of them are y's. That number is known to us 
as C(n, r). ■

*Using distributivity to inventory the ways to make a sequence of decisions is an idea of fundamental 
importance in Polya's enumeration theory (Chapter 3) and the theory of generating functions (Chapter 4). 
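
The two steps of this proof, listing the words and then combining like terms, can be imitated directly. The Python sketch below is an illustration only (the function name expand_by_words is invented here): it builds every n-letter word over {x, y} and tallies the words by their number of x's.

```python
from collections import Counter
from itertools import product
from math import comb

def expand_by_words(n):
    """Expand (x + y)^n as a list of n-letter words, then combine like terms."""
    words = ["".join(w) for w in product("xy", repeat=n)]   # one letter chosen per factor
    # With x and y commuting, a word is determined by how many x's it contains.
    coefficients = Counter(word.count("x") for word in words)
    return words, coefficients

words, coefficients = expand_by_words(3)
print(words)   # ['xxx', 'xxy', 'xyx', 'xyy', 'yxx', 'yxy', 'yyx', 'yyy']
print([coefficients[r] for r in range(4)])   # [1, 3, 3, 1]
```

The second line of output lists the coefficients of $y^3$, $xy^2$, $x^2y$, and $x^3$, in agreement with Equation (1.18).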

Substituting x = y = 1 in the binomial theorem results in a new proof that

$2^n = \sum_{r=0}^{n} C(n, r)$.

Setting x = -1 and y = 1 leads to another proof of Lemma 1.5.8, i.e.,

$\sum_{r=0}^{n} (-1)^r C(n, r) = 0$

for all n ≥ 1. New results can be derived by making other substitutions, e.g., x = 2 
and y = 1 yields an identity expressing $3^n$ in terms of powers of 2, namely,

$3^n = \sum_{r=0}^{n} C(n, r)\, 2^r$. (1.19)
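
These substitutions are easy to spot-check numerically. In the sketch below, binomial_sum is a hypothetical helper that evaluates the right-hand side of the binomial theorem; the three columns printed for each n should be $2^n$, 0, and $3^n$.

```python
from math import comb

def binomial_sum(n, x, y):
    """Right-hand side of the binomial theorem: the sum of C(n, r) x^r y^(n-r) for r = 0..n."""
    return sum(comb(n, r) * x**r * y**(n - r) for r in range(n + 1))

for n in range(1, 6):
    # x = y = 1, then x = -1 and y = 1, then x = 2 and y = 1
    print(n, binomial_sum(n, 1, 1), binomial_sum(n, -1, 1), binomial_sum(n, 2, 1))
```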



What happens if there are three variables? This is where the road less traveled by 
makes all the difference. Just as $(x + y)(x + y) \cdots (x + y)$ inventories the ways to 
make a sequence of n decisions each having two choices, 
$(x + y + z)(x + y + z) \cdots (x + y + z)$ inventories the ways to make a comparable sequence 
of decisions each having three choices. From this perspective, the process of 
expanding $(x_1 + x_2 + \cdots + x_k)^n$ is the same whether k = 2 or k = 100. 
Choose one of $x_1, x_2, \ldots, x_k$ from each of n sets of brackets. String the 
choices together, in order, obtaining an n-letter word. Do this in all $k^n$ possible 
ways and sum. The resulting inventory is then simplified in two steps. First, 
each word is replaced with a monomial of (total) degree n, and then like terms 
are combined. At the end of this process, the coefficient of $x_1^{r_1} x_2^{r_2} \cdots x_k^{r_k}$ is the 
number of n-letter words that can be produced using $r_1$ copies of $x_1$, $r_2$ copies of $x_2$, ..., 
and $r_k$ copies of $x_k$. This proves the following generalization of the binomial 
theorem.

1.7.2 Multinomial Theorem. If n is a nonnegative integer, then

$(x_1 + x_2 + \cdots + x_k)^n = \sum \binom{n}{r_1, r_2, \ldots, r_k} x_1^{r_1} x_2^{r_2} \cdots x_k^{r_k}$, (1.20)

where the sum is over all nonnegative integer solutions to the equation 
$r_1 + r_2 + \cdots + r_k = n$, and

$\binom{n}{r_1, r_2, \ldots, r_k} = \frac{n!}{r_1!\, r_2! \cdots r_k!}$.

Because some of the r's in Equation (1.20) may be zero, the sum is not over the 
k-part compositions of n. (Since 0! = 1, the definition of multinomial coefficient is 
easily modified so as to permit zeros among its entries.) 

1.7.3 Example. It isn't necessary to compute all $5^{10}$ = 9,765,625 products in 
the expansion of $(a + b + c + d + e)^{10}$ just to determine the coefficient of $a^4 d^6$! 
From the multinomial theorem,

$\binom{10}{4, 0, 0, 6, 0} = \frac{10!}{4!\,0!\,0!\,6!\,0!} = \frac{10!}{4!\,6!} = 210$.

Observe that 210 = C(10, 4) is also the coefficient of $a^4 d^6$ in $(a + d)^{10}$, just as it 
should be. Setting b = c = e = 0 in $(a + b + c + d + e)^{10}$ has no effect on the 
coefficient of $a^4 d^6$. Also, observe that $\binom{10}{4, 0, 0, 6, 0} = \binom{10}{0, 0, 6, 0, 4}$. The coefficient of 
$c^6 e^4$ is also 210, reflecting the symmetry of $(a + b + c + d + e)^{10}$. We will return 
to this point momentarily. □ 

The usefulness of the multinomial theorem is not limited to picking off single 
coefficients. The expansion of all $3^4 = 81$ terms of $(x + y + z)^4$, e.g., looks like 
this:

$\binom{4}{4, 0, 0} x^4 + \cdots + \binom{4}{1, 2, 1} x y^2 z + \cdots + \binom{4}{1, 0, 3} x z^3 + \cdots + \binom{4}{0, 0, 4} z^4$.



1.7.4 Example. What is the coefficient of xy in the expansion of $(1 + x + y)^5$? 
Solution: Because $xy = 1^3 xy$, the multinomial theorem can be applied directly. 
The answer is $\binom{5}{3, 1, 1} = 20$. Computing the coefficient of xy in $(2 + x + y)^5$ requires 
two steps. From the multinomial theorem, the coefficient of $2^3 xy$ is $\binom{5}{3, 1, 1} = 20$. So, 
the xy-term in the expansion of $(2 + x + y)^5$ is $20 \times 2^3 \times xy$, and the coefficient 
we're looking for is 20 × 8 = 160.

What about the coefficient of $x^3 y^5 z^2$ in $(2x - 3y + 4z)^{10}$? Since the coefficient of 
$(2x)^3 (-3y)^5 (4z)^2$ is $\binom{10}{3, 5, 2} = 2520$, the coefficient of $x^3 y^5 z^2$ must be 
$2520 \times 2^3 \times (-3)^5 \times 4^2$ = -78,382,080. □ 
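
The two-step computation in this example (a multinomial coefficient times powers of the numerical weights) is mechanical enough to script. The following is a minimal Python sketch; the names multinomial and coefficient are chosen for the illustration, and the constant 2 in $(2 + x + y)^5$ is treated as the weight of a variable raised to the third power.

```python
from math import factorial, prod

def multinomial(*parts):
    """Multinomial coefficient n! / (r_1! r_2! ... r_k!), where n = r_1 + ... + r_k."""
    result = factorial(sum(parts))
    for r in parts:
        result //= factorial(r)   # each division is exact
    return result

def coefficient(weights, exponents):
    """Coefficient of x_1^e_1 ... x_k^e_k in (w_1 x_1 + ... + w_k x_k)^n, with n the sum of the e_i."""
    return multinomial(*exponents) * prod(w ** e for w, e in zip(weights, exponents))

print(multinomial(4, 0, 0, 6, 0))            # 210
print(coefficient((2, 1, 1), (3, 1, 1)))     # 160
print(coefficient((2, -3, 4), (3, 5, 2)))    # -78382080
```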

As with the binomial theorem, numerous identities can be obtained by making 
various substitutions for the variables in the multinomial theorem. Setting x = y = 
z = 1 in $(x + y + z)^n$, e.g., yields

$3^n = \sum_{r+s+t=n} \binom{n}{r, s, t}$. (1.21)

Together with Equation (1.19), this produces the curious identity

$\sum_{r=0}^{n} C(n, r)\, 2^r = \sum_{r+s+t=n} \binom{n}{r, s, t}$.

The multinomial theorem tells us that $x_1^{r_1} x_2^{r_2} \cdots x_k^{r_k}$ occurs among the $k^n$ products 
in the expansion of $(x_1 + x_2 + \cdots + x_k)^n$ with multiplicity $\binom{n}{r_1, r_2, \ldots, r_k}$, but it does 
not tell us how many different monomial terms of the form $\binom{n}{r_1, r_2, \ldots, r_k} x_1^{r_1} x_2^{r_2} \cdots x_k^{r_k}$ 
occur in the expansion. 

1.7.5 Theorem. The number of different monomials of degree n in the k 
variables $x_1, x_2, \ldots, x_k$ is C(n + k - 1, n).

Proof. From Corollary 1.6.9, the equation $r_1 + r_2 + \cdots + r_k = n$ has exactly 
C(n + k - 1, n) nonnegative integer solutions. ■ 
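
For small k and n the theorem can be checked against the word inventory itself. The sketch below is an illustration in Python (the function name distinct_monomials is an assumption): it forms all $k^n$ words, records the exponent vector of each, and counts how many different vectors occur.

```python
from itertools import product
from math import comb

def distinct_monomials(k, n):
    """Exponent vectors of the monomials that arise when (x_1 + ... + x_k)^n is multiplied out."""
    monomials = set()
    for word in product(range(k), repeat=n):               # one variable index per factor
        exponents = tuple(word.count(i) for i in range(k))  # forget the order of the letters
        monomials.add(exponents)
    return monomials

print(len(distinct_monomials(3, 4)), comb(4 + 3 - 1, 4))   # 15 15
print(len(distinct_monomials(3, 5)), comb(5 + 3 - 1, 5))   # 21 21
```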

It makes perfect sense, of course, that the multinomial expansion of 
$(x_1 + x_2 + \cdots + x_k)^n$ should consist of C(n + k - 1, n) different monomial terms! 
In the first stage of computing

$(x_1 + x_2 + \cdots + x_k)(x_1 + x_2 + \cdots + x_k) \cdots (x_1 + x_2 + \cdots + x_k)$,

each n-letter word identifies one of the $k^n$ different ways to choose n times from 
$\{x_1, x_2, \ldots, x_k\}$ with replacement where order matters. After simplifying, each 
term in the resulting sum represents one of the C(n + k - 1, n) different ways 
to choose n times from $\{x_1, x_2, \ldots, x_k\}$ with replacement where order doesn't 
matter.

The multinomial expansion of $(x + y + z)^4$ is a homogeneous polynomial* 
comprised of C(4 + 3 - 1, 4) = 15 monomial terms, one of which is

$\binom{4}{1, 2, 1} x y^2 z = 12 x y^2 z$.



Because $(x + y + z)^4$ is symmetric†, its multinomial expansion must be symmetric 
as well. Because switching x and y would interchange, e.g., $xy^2z$ and $x^2yz$, these two 
monomials must have the same coefficient in the expansion of $(x + y + z)^4$. Indeed, 
$\binom{4}{1, 2, 1} = \binom{4}{2, 1, 1}$; the value of a multinomial coefficient does not change when two 
numbers in its bottom row are switched! From either perspective, it is clear that

$12x^2yz + 12xy^2z + 12xyz^2 = 12(x^2yz + xy^2z + xyz^2)$

is a summand in the expansion of $(x + y + z)^4$, and it is natural to group these terms 
together. Organizing the remaining 12 terms in a similar fashion yields

$(x + y + z)^4 = (x^4 + y^4 + z^4) + 4(x^3y + x^3z + xy^3 + xz^3 + y^3z + yz^3) + 6(x^2y^2 + x^2z^2 + y^2z^2) + 12(x^2yz + xy^2z + xyz^2)$. (1.22)

The minimal symmetric polynomials‡ on the right-hand side of this equation 
have the symbolic names

$M_{[4]}(x, y, z) = x^4 + y^4 + z^4$,
$M_{[3,1]}(x, y, z) = x^3y + x^3z + xy^3 + xz^3 + y^3z + yz^3$,
$M_{[2,2]}(x, y, z) = x^2y^2 + x^2z^2 + y^2z^2$,
$M_{[2,1,1]}(x, y, z) = x^2yz + xy^2z + xyz^2$.

Using this terminology, Equation (1.22) can be expressed as

$(x + y + z)^4 = M_{[4]}(x, y, z) + \binom{4}{3, 1, 0} M_{[3,1]}(x, y, z) + \binom{4}{2, 2, 0} M_{[2,2]}(x, y, z) + \binom{4}{2, 1, 1} M_{[2,1,1]}(x, y, z)$. (1.23)



*Each term has the same (total) degree, in this case four. 
†Switching (any) two variables does not change the polynomial.

‡"Minimal symmetric polynomial" is a descriptive name; these polynomials are known to experts as 
monomial symmetric functions.

1.7.6 Example. If $37y^3z$ is among the monomial terms of a symmetric 
polynomial p(x, y, z), then

$37(x^3y + x^3z + xy^3 + xz^3 + y^3z + yz^3) = 37M_{[3,1]}(x, y, z)$

must be a summand of p(x, y, z). □ 

There is nothing quite like a mountain of superscripts and subscripts to dull 
one's enthusiasm. So, there must be very good reasons for tolerating them in an 
introductory text. With a little getting used to, Equation (1.23) offers the best 
way to get a handle on the multinomial theorem, and a whole lot more! Let's see 
some more examples. 

1.7.7 Example. By the multinomial theorem,

$(x + y + z)^5 = \sum \binom{5}{a, b, c} x^a y^b z^c$, (1.24)

where the sum is over the nonnegative integer solutions to a + b + c = 5. The 
analog of Equation (1.23) is

$(x + y + z)^5 = M_{[5]}(x, y, z) + \binom{5}{4, 1, 0} M_{[4,1]}(x, y, z) + \binom{5}{3, 2, 0} M_{[3,2]}(x, y, z) + \binom{5}{3, 1, 1} M_{[3,1,1]}(x, y, z) + \binom{5}{2, 2, 1} M_{[2,2,1]}(x, y, z)$, (1.25)

where the C(5 + 3 - 1, 5) = 21 monomials of degree 5 have been organized into 
the minimal symmetric polynomials

$M_{[5]}(x, y, z) = x^5 + y^5 + z^5$,
$M_{[4,1]}(x, y, z) = x^4y + x^4z + xy^4 + xz^4 + y^4z + yz^4$,
$M_{[3,2]}(x, y, z) = x^3y^2 + x^3z^2 + x^2y^3 + x^2z^3 + y^3z^2 + y^2z^3$,
$M_{[3,1,1]}(x, y, z) = x^3yz + xy^3z + xyz^3$,
$M_{[2,2,1]}(x, y, z) = x^2y^2z + x^2yz^2 + xy^2z^2$. □ 
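
The grouping of monomials into minimal symmetric polynomials amounts to sorting exponent vectors by their "shape", i.e., by their nonzero exponents written in decreasing order. The Python sketch below illustrates this under that assumption; the names symmetric_grouping and multinomial are invented for the illustration and are not notation from the text.

```python
from collections import defaultdict
from math import factorial

def symmetric_grouping(k, n):
    """Group the degree-n monomials in k variables by the shape of their exponent vectors."""
    def exponent_vectors(slots, remaining):
        # All nonnegative integer solutions of r_1 + ... + r_slots = remaining.
        if slots == 1:
            yield (remaining,)
            return
        for first in range(remaining + 1):
            for rest in exponent_vectors(slots - 1, remaining - first):
                yield (first,) + rest

    groups = defaultdict(list)
    for exps in exponent_vectors(k, n):
        shape = tuple(sorted((e for e in exps if e > 0), reverse=True))
        groups[shape].append(exps)
    return dict(groups)

def multinomial(parts):
    """n! / (r_1! ... r_k!) for the exponents in parts; zero entries may be omitted."""
    result = factorial(sum(parts))
    for r in parts:
        result //= factorial(r)
    return result

for shape, members in sorted(symmetric_grouping(3, 5).items(), reverse=True):
    print(shape, multinomial(shape), len(members))
```

For k = 3 and n = 5 this reports five shapes, [5], [4,1], [3,2], [3,1,1], and [2,2,1], with coefficients 1, 5, 10, 20, 30 and with 3, 6, 6, 3, 3 monomials, respectively, for a total of 21 monomials. Weighting each group size by its coefficient gives 3 + 30 + 60 + 60 + 90 = 243 = $3^5$, the total number of words.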

1.7.8 Example. The fifth power of a three-term sum was expanded in 
Example 1.7.7. Applying the multinomial theorem to the third power of a 
five-term sum produces

$(a + b + c + d + e)^3 = M_{[3]}(a, b, c, d, e) + 3M_{[2,1]}(a, b, c, d, e) + 6M_{[1,1,1]}(a, b, c, d, e)$, (1.26)

where

$M_{[3]}(a, b, c, d, e) = a^3 + b^3 + c^3 + d^3 + e^3$,

$M_{[2,1]}(a, b, c, d, e) = (a^2b + a^2c + a^2d + a^2e) + (ab^2 + b^2c + b^2d + b^2e) + \cdots + (ae^2 + be^2 + ce^2 + de^2)$, (1.27)

and

$M_{[1,1,1]}(a, b, c, d, e) = abc + abd + abe + acd + ace + ade + bcd + bce + bde + cde$. (1.28)

□

It is just a coincidence that the 4th and 5th powers of x + y + z involve four and five minimal symmetric 
polynomials, respectively. The 6th power involves seven. 



1.7. EXERCISES 

1 What is the coefficient of $x^5$ in the binomial expansion of 
(a) $(x + y)^5$? (b) $(1 + x)^7$? (c) $(1 + x)^9$?

(d) $(2 + x)^7$? (e) $(1 + 2x)^7$? (f) $(1 - x)^9$? 
(g) $(2 - x)^4$? (h) $(2x + y)^4$? (i) $(2x - 3y)^8$? 

2 What is the coefficient of $x^2 y^3$ in the multinomial expansion of 
(a) $(x + y)^5$? (b) $(1 + x + y)^5$? 



(c) (1 +x + yfl (d) (2k -y) 

(e) $(2 + x + y)^5$? (f) $(3 + 2x - y)^8$?

(g) $(x - y + z)^5$? (h) $(-3 + x - 2y + z)^8$?

(i) $(1 - 2x + 3y - 4z)^7$? (j) $(1 - 2x + 3y - 4z)^4$? 

3 Confirm Equation (1.21) in the case 

(a) n = 4 by setting x = y = z = 1 in Equation (1.22). 

(b) n = 5 by setting x = y = z = 1 in Equation (1.25). 

4 Prove that $k^n = \sum \binom{n}{r_1, r_2, \ldots, r_k}$, where the sum is over all nonnegative integer 
sequences $(r_1, r_2, \ldots, r_k)$ that sum to n. 

5 Consider the multinomial expansion of $(a + b + c + d + e)^3$ from Example 1.7.8.

(a) Explain why 3 and 6 are the correct coefficients of $M_{[2,1]}(a, b, c, d, e)$ and 
$M_{[1,1,1]}(a, b, c, d, e)$, respectively.

(b) Explain why $M_{[2,1]}(a, b, c, d, e)$ is a sum, not of C(5, 2) = 10 monomials, 
but of P(5, 2) = 20.

(c) Explain why $M_{[1,1,1]}(a, b, c, d, e)$ is a sum, not of P(5, 3) = 60 monomials, 
but of C(5, 3) = 10.

(d) Explain why the equation 5 + P(5, 2) + C(5, 3) = C(7, 3) is a confirming 
instance of Theorem 1.7.5.

(e) Without doing any arithmetic, explain why 5 + 3P(5, 2) + 6C(5, 3) = $5^3$. 

6 Prove the following special case of Exercise 10(c), Section 1.2, by 
differentiating $(1 + x)^n$ and setting x = 1:

$\binom{n}{1} + 2\binom{n}{2} + 3\binom{n}{3} + \cdots + r\binom{n}{r} + \cdots + n\binom{n}{n} = n 2^{n-1}$.

7 Of the 66 terms in the multinomial expansion of $(x + y + z)^{10}$, how many 
involve 

(a) just one variable? 

(b) exactly two variables? 

(c) all three variables? 

8 Show how Vandermonde's identity, 

$C(m, 0)C(n, r) + C(m, 1)C(n, r - 1) + \cdots + C(m, r)C(n, 0) = C(m + n, r)$,

follows from the equation $(x + 1)^m (x + 1)^n = (x + 1)^{m+n}$.

9 Let n be a fixed but arbitrary positive integer. Multiply each multinomial 
coefficient of the form $\binom{n}{a, b, c, d}$ by $(-1)^{b+d}$ and add the results. Prove that the 
sum is zero. 

10 Compute the coefficient of 

(a) $x^8$ in $(x^2 + 1)^7$.

(b) $x^8$ in $(x^2 + x)^7$.

(c) $x^8$ in $(x^2 + x + 1)^7$.

(d) $x^5$ in $(1 + x + x^2)^7$.

(e) $x^2 y^2$ in $(3 + xy + xz + yz)^4$.

(f) $x^2 y^2 z^2$ in $(3 + xy + xz + yz)^4$.

11 Let n be a positive integer and p a positive prime.

(a) Suppose $0 \le r_i < p$, $1 \le i \le n$. Prove that $\binom{p}{r_1, r_2, \ldots, r_n}$ is a multiple of p.

(b) Prove Fermat's "little theorem"*, i.e., that $n^p - n$ is an integer multiple of p.

*After Pierre de Fermat (1601-1665).

12 Give the (two-decision) inductive proof of the binomial theorem. 

13 Write out all the terms of the minimal symmetric polynomial 

(a) $M_{[6,4]}(x, y, z)$ (b) $M_{[5,5]}(x, y, z)$ 

14 Denote the coefficient of $x^r$ in $(1 + x + x^2 + \cdots + x^{k-1})^n$ by $C_k(n, r)$.

(a) Show that $C_2(n, r) = C(n, r)$.

(b) Compute $C_3(3, 3)$.

(c) If n > 1, show that $C_k(n, r) = \sum_{i=0}^{k-1} C_k(n - 1, r - i)$.

15 The multinomial expansion of $(x + y + z)^4$ can be expressed as a linear 
combination of four minimal symmetric polynomials and the expansion of 
$(x + y + z)^5$ as a linear combination of five. How many minimal symmetric 
polynomials are involved in the multinomial expansion of $(x + y + z)^{10}$? (Two 
of them appear in Exercise 13.) 

16 It follows from Theorem 1.6.11 that the number of compositions of n having k 
or fewer parts is $N(n - 1, k - 1) = C(n - 1, 0) + C(n - 1, 1) + \cdots + C(n - 1, k - 1)$. 
By Theorem 1.7.5, there are C([n - 1] + k, k - 1) different 
monomials in the multinomial expansion of $(x_1 + x_2 + \cdots + x_k)^n$. It does not 
seem to follow, however, that N(n - 1, k - 1) = C([n - 1] + k, k - 1). With 
n = 6 and k = 3, N(5, 2) = 16 while C([6 - 1] + 3, 3 - 1) = C(8, 2) = 28. 
Write out enough terms in the expansion of $(x + y + z)^6$ to explain where the 
numbers 16 and 28 come from. 

17 Use Theorem 1.5.1 and the binomial theorem to give another proof of the 
multinomial theorem. 

18 Exercise 14, Section 1.1, asks for an explicit listing of the 24 (exact) positive
integer divisors of $360 = 2^3 3^2 5$. Without doing any arithmetic, explain why the
sum of these 24 divisors is given by the product
$(1 + 2 + 2^2 + 2^3)(1 + 3 + 3^2)(1 + 5)$.

19 Suppose the prime factorization of $n = p_1^{r_1} p_2^{r_2} \cdots p_k^{r_k}$. Prove that the sum of the
divisors of n is the product

$$\prod_{t=1}^{k} \frac{p_t^{r_t + 1} - 1}{p_t - 1}.$$

20 Explain how the binomial theorem can be used to prove that $\sum_{r=0}^{n} P(r) = 1$,
where $P(r) = C(n, r) p^r q^{n-r}$ is the binomial probability distribution of Section
1.3, Equation (1.5).

21 For a fixed but arbitrary integer n > 2, define
$g(r) = M_{[r]}(1, 2, \ldots, n - 1) = 1^r + 2^r + \cdots + (n - 1)^r$.

(a) Prove that $\sum_{r=0}^{k} C(k + 1, r) g(r) = n^{k+1} - 1$.

(b) Given g(0), g(1), ..., g(r), the equation in part (a) can be used to solve for
g(r + 1). Starting from g(0) = n - 1, use this method to compute g(1),
g(2), g(3), and g(4).

(c) Compare and contrast with the approach suggested by Section 1.5, 
Exercise 11. 

(d) Explain the connection with Bernoulli numbers (Section 1.5, Exer- 
cises 20-22). 

22 Show that $\sum_{r=0}^{25} C(50, r) C(50 - r, 25 - r) = 2^{25} C(50, 25)$.

23 Compute 

(a) $\sum_{r=25}^{50} C(50, r) C(r, 25)$.

(b) $\sum_{r=0}^{25} (-1)^r C(50, r) C(50 - r, 25)$.

24 Prove that the alternative view of distributivity used to prove the binomial and
multinomial theorems is valid, i.e., suppose $S_1, S_2, \ldots, S_n$ are sums of
algebraic terms. Prove that $S_1 \times S_2 \times \cdots \times S_n$ is the sum of all products that
can be obtained by choosing one term from each sum, multiplying the choices
together, doing this in all $o(S_1) \times o(S_2) \times \cdots \times o(S_n)$ possible ways, and
adding the resulting products. (Hint: Induction on n.)
Something there is that doesn't love a wall.

— Robert Frost (Mending Wall)

In the last section, we grouped the $C(n + k - 1, n)$ different monomials from the
multinomial expansion of $(x_1 + x_2 + \cdots + x_k)^n$ into certain minimal symmetric
polynomials with symbolic names like $M_{[4,1]}$ and $M_{[2,1]}$.

1.8.1 Definition. A partition of n having m parts is an unordered collection of m 
positive integers that sum to n. 

1.8.2 Example. The number 6 is said to be perfect* because it is the sum of its
proper divisors: 6 = 1 + 2 + 3. Since addition is commutative, this sum could just
as well have been written 2 + 3 + 1. In this context, 1 + 2 + 3 is the same as
2 + 3 + 1 but different from 4 + 2. In expressing the perfection of 6, what interests
us is the unordered collection of its proper divisors, the partition whose parts are 3,
2, and 1. □

*A Christian theologian once argued that God, who could have created the universe in an instant, chose
instead to labor for 6 days in order to emphasize the perfection of His creation. (It is just an accident that
this book has 6 chapters.)

Two partitions of n are equal if and only if they have the same parts with the
same multiplicities. By way of contrast, a composition of n (Definition 1.6.10) is
an ordered collection of positive integers that sum to n. Compositions are sometimes
called ordered partitions. Two compositions are equal if and only if they
have the same parts with the same multiplicities, in the same order.

Our discussion of partitions will be simplified by the adoption of some notation.

1.8.3 Definition. An m-part partition of n is represented by a sequence
$\pi = [\pi_1, \pi_2, \ldots, \pi_m]$ in which the parts are arranged so that
$\pi_1 \ge \pi_2 \ge \cdots \ge \pi_m > 0$. The number of parts is the length of $\pi$, denoted $\ell(\pi) = m$. The shorthand
expression $\pi \vdash n$ signifies that "$\pi$ is a partition of n".

In ordinary English usage, arranging the parts of a partition from largest to smallest
would typically be called "ordering" the parts. This semantic difficulty is the
source of more than a little confusion. It is precisely because a partition is unordered
that we are free to arrange its parts any way we like. The 5 cards comprising
a poker hand can be arranged in any one of 5! = 120 different ways. But, no matter
how the cards are arranged or rearranged, the poker hand is the same. So it is with
partitions. A composition, on the other hand, is some specified arrangement of the
parts of a partition. By convention (Definition 1.8.3), we uniformly choose one such
composition to represent each partition.

1.8.4 Example. The three-part partitions of 6 are [4, 1, 1], [3, 2, 1], and [2, 2, 2].
There are 3 ways to arrange the parts of [4, 1, 1], 6 ways to arrange the parts of
[3, 2, 1], but only one way to arrange the parts of [2, 2, 2]. Taken together, these
10 arrangements comprise the compositions of 6 having 3 parts (as illustrated in
Fig. 1.6.2). □

Already it seems convenient to introduce some additional shorthand notation.
Rather than [4, 1, 1] and [2, 2, 2], we will write $[4, 1^2]$ and $[2^3]$, respectively. Similarly,
the partition [5, 5, 3, 3, 3, 3, 2, 2, 2, 1] is abbreviated $[5^2, 3^4, 2^3, 1]$. In this
notation superscripts denote, not exponents, but multiplicities. In the 10-part
partition $[5^2, 3^4, 2^3, 1]$, the piece $3^4$ contributes, not 3 × 3 × 3 × 3 = 81, but
3 + 3 + 3 + 3 = 12 to the sum

5 + 5 + 3 + 3 + 3 + 3 + 2 + 2 + 2 + 1 = 29.
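The counts in Example 1.8.4 can be checked by brute force. The following Python sketch is an illustration only, not part of the text's development; the names `partitions_with_parts` and `compositions_from` are hypothetical, chosen here for convenience. It lists the three-part partitions of 6 and counts, for each one, the distinct rearrangements of its parts.

```python
from itertools import permutations

def partitions_with_parts(n, m, largest=None):
    """Yield the m-part partitions of n as non-increasing tuples."""
    if largest is None:
        largest = n
    if m == 0:
        if n == 0:
            yield ()
        return
    # The first part must leave at least 1 for each of the other m - 1 parts.
    for first in range(min(largest, n - (m - 1)), 0, -1):
        for rest in partitions_with_parts(n - first, m - 1, first):
            yield (first,) + rest

def compositions_from(partition):
    """Return the distinct rearrangements (compositions) of a partition."""
    return sorted(set(permutations(partition)), reverse=True)

three_part = list(partitions_with_parts(6, 3))
counts = [len(compositions_from(p)) for p in three_part]
print(three_part, counts, sum(counts))
```

Under these assumptions the three counts add up to C(5, 2) = 10, the number of three-part compositions of 6 given by Theorem 1.6.11.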



The m-part compositions of n were counted in Theorem 1.6.11. (They number
$C(n - 1, m - 1)$.) Counting the m-part partitions of n is not so easy. Let's begin by
giving this number a name.
1.8.5 Definition. The number of m-part partitions of n is denoted $p_m(n)$.

Figure 1.8.1. The partition triangle.

1.8.6 Example. From Example 1.8.4, $p_3(6) = 3$. The seven partitions of 5 are
[5], [4, 1], [3, 2], $[3, 1^2]$ = [3, 1, 1], $[2^2, 1]$ = [2, 2, 1], $[2, 1^3]$ = [2, 1, 1, 1], and
$[1^5]$ = [1, 1, 1, 1, 1], having lengths 1, 2, 2, 3, 3, 4, and 5, respectively. Hence,
$p_1(5) = 1$, $p_2(5) = 2$, $p_3(5) = 2$, $p_4(5) = 1$, and $p_5(5) = 1$. □

Because [n] is the only partition of n having just one part and, at the other
extreme, $[1^n]$ is the only partition of n having n parts, $p_1(n) = 1 = p_n(n)$ for all
n. If n > 2, then $[2, 1^{n-2}]$ is the only partition of n having length n - 1, so
$p_{n-1}(n) = 1$ as well.

The numbers $p_m(n)$ are displayed in the Pascal-like partition triangle of
Fig. 1.8.1, where it is understood that $p_m(n) = 0$ when m > n. What is needed is
a Pascal-like relation that would allow the entries of this triangle to be filled in a
row at a time.

1.8.7 Theorem. The number of m-part partitions of n is
$p_m(n) = p_{m-1}(n - 1) + p_m(n - m)$, 1 < m < n.

Proof. If $\pi$ is an m-part partition of n, then $\pi_m = 1$ or it doesn't. Deleting the
final part from a partition of the first kind leaves an (m - 1)-part partition of n - 1, so there are
$p_{m-1}(n - 1)$ partitions of the first kind. Because $\pi \leftrightarrow [\pi_1 - 1, \pi_2 - 1, \ldots, \pi_m - 1]$
is a one-to-one correspondence between the m-part partitions of n satisfying $\pi_m > 1$
and the m-part partitions of n - m, there must be $p_m(n - m)$ partitions of the second
kind. ■

From Theorem 1.8.7, $p_2(4) = p_1(3) + p_2(4 - 2) = p_1(3) + p_2(2) = 1 + 1 = 2$.
(The two-part partitions of 4 are [3, 1] and $[2^2]$.) Similarly, $p_2(6) = p_1(5) + p_2(4) = 1 + 2 = 3$,
and $p_4(6) = p_3(5) + p_4(2) = 2 + 0 = 2$. This completes
Fig. 1.8.1 through row 6. Rows 7-10 are completed in Fig. 1.8.2.
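The row-at-a-time procedure just described is easy to mechanize. Here is a minimal Python sketch, offered as an illustration rather than as part of the text; the function name `partition_triangle` is hypothetical, and the boundary value $p_0(0) = 1$ is an assumption adopted here so that the recurrence of Theorem 1.8.7 also produces the entries with m = 1 and m = n.

```python
def partition_triangle(max_n):
    """Return a table p with p[n][m] = number of m-part partitions of n."""
    p = [[0] * (max_n + 1) for _ in range(max_n + 1)]
    p[0][0] = 1  # assumed convention: the empty partition of 0 has no parts
    for n in range(1, max_n + 1):
        for m in range(1, n + 1):
            # Theorem 1.8.7: smallest part equals 1, or every part exceeds 1.
            p[n][m] = p[n - 1][m - 1] + p[n - m][m]
    return p

p = partition_triangle(10)
for n in range(1, 11):
    print(n, p[n][1:n + 1], sum(p[n]))
```

Each printed line shows n, the nonzero entries of row n, and the row sum.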

1.8.8 Definition. Denote the number of partitions of n by
$p(n) = p_1(n) + p_2(n) + \cdots + p_n(n)$.
 n\m   1   2   3   4   5   6   7   8   9  10
  1    1
  2    1   1
  3    1   1   1
  4    1   2   1   1
  5    1   2   2   1   1
  6    1   3   3   2   1   1
  7    1   3   4   3   2   1   1
  8    1   4   5   5   3   2   1   1
  9    1   4   7   6   5   3   2   1   1
 10    1   5   8   9   7   5   3   2   1   1

Figure 1.8.2. The partition numbers $p_m(n)$.

Just as the nth row sum of Pascal's triangle is $2^n$, the total number of subsets of
an n-element set, the nth row sum of the partition triangle is p(n), the total number
of partitions of n. Summing rows 9 and 10 of Fig. 1.8.2, e.g., yields the partition
numbers p(9) = 30 and p(10) = 42.*

If $\pi$ is an m-part partition of n, its Ferrers diagram,† $F(\pi)$, consists of n
"boxes" arrayed in m left-justified rows, where the number of boxes in row i is
$\pi_i$. The diagrams for $[5, 3^2, 1]$ and $[4, 3^2, 1^2]$, e.g., appear in Fig. 1.8.3.

1.8.9 Definition. The conjugate of $\pi \vdash n$ is the partition $\pi^* \vdash n$ whose jth part is
the number of boxes in the jth column of $F(\pi)$.

Because the number of boxes in row j of $F(\pi^*)$ is equal to the number of boxes
in column j of $F(\pi)$ for all j, the two diagrams are transposes of each other. In
particular, partition $\alpha = \pi^*$ if and only if $\alpha^* = \pi$. This situation is illustrated in Fig.
1.8.3 for the conjugate pair $[5, 3^2, 1]$ and $[4, 3^2, 1^2]$.

□□□□□            □□□□
□□□              □□□
□□□              □□□
□                □
                 □

F([5, 3^2, 1])   F([4, 3^2, 1^2])

Figure 1.8.3. Two Ferrers diagrams.

*The partition numbers grow rapidly with n. MacMahon showed, e.g., that
p(200) = 3,972,999,029,388.

†Named for Norman Macleod Ferrers (1829-1903) but possibly used earlier by J. J. Sylvester
(1814-1897).

The number of boxes in the jth column of $F(\pi)$ is equal to the number of rows of
$F(\pi)$ that contain at least j boxes, i.e., $\pi_j^*$ is equal to the number of parts of $\pi$ that are
not less than j. Said another way, the jth part of $\pi^*$ is

$\pi_j^* = o(\{i : \pi_i \ge j\})$. (1.29)
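Equation (1.29) translates almost word for word into code. The following Python sketch is illustrative only; the function name `conjugate` and the choice to represent a partition as a non-increasing list of positive integers are assumptions made here.

```python
def conjugate(partition):
    """Conjugate of a partition given as a non-increasing list of positive parts."""
    if not partition:
        return []
    # Equation (1.29): the jth part of the conjugate counts the parts >= j.
    return [sum(1 for part in partition if part >= j)
            for j in range(1, partition[0] + 1)]

print(conjugate([5, 3, 3, 1]))
print(conjugate(conjugate([5, 3, 3, 1])))
```

Applied to [5, 3, 3, 1], the function should reproduce the conjugate pair of Fig. 1.8.3, and applying it twice should return the original partition.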



1.8.10 Theorem. The number of m-part partitions of n is equal to the number of
partitions of n whose largest part is m.

Proof. If $\pi$ is an m-part partition of n, then m is the number of boxes in the first
column of $F(\pi)$, i.e., $m = \pi_1^*$, the largest part of $\pi^*$. Hence, in the one-to-one
correspondence between partitions and their conjugates, the set of m-part partitions
corresponds to the set of partitions whose largest part is m. ■
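For small n, Theorem 1.8.10 can be confirmed by listing every partition and tallying it twice, once by length and once by largest part. The Python sketch below is a check by enumeration, not a proof; the names `partitions` and `counts_by_length_and_largest` are hypothetical, and n is assumed to be a positive integer.

```python
def partitions(n, largest=None):
    """Yield every partition of n as a non-increasing tuple of positive parts."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def counts_by_length_and_largest(n):
    """Tally the partitions of n (n >= 1) by number of parts and by largest part."""
    by_length = [0] * (n + 1)
    by_largest = [0] * (n + 1)
    for p in partitions(n):
        by_length[len(p)] += 1
        by_largest[p[0]] += 1
    return by_length, by_largest

by_length, by_largest = counts_by_length_and_largest(10)
print(by_length)
print(by_largest)
```

Entry m of either list is the count for that value of m, so the two lists should agree position by position.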

1.8.11 Definition. Partition $\pi$ is self-conjugate if $\pi^* = \pi$.

1.8.12 Example. Because $\pi = \pi^*$ if and only if $F(\pi) = F(\pi^*) = F(\pi)^t$, the
transpose of $F(\pi)$, $\pi$ is self-conjugate if and only if its Ferrers diagram is symmetric
about the "main diagonal". Thus, merely by glancing at Fig. 1.8.4, one sees that
[5, 4, 3, 2, 1] and $[5, 1^4]$ are self-conjugate partitions. On the other hand, without a
Ferrers diagram to look at, it is much less obvious that $[5^2, 4, 3, 2]$ is self-conjugate. □

□□□□□                □□□□□
□□□□                 □
□□□                  □
□□                   □
□                    □

F([5, 4, 3, 2, 1])   F([5, 1^4])

Figure 1.8.4. Two self-conjugate partitions.

Knowing something about partitions, we can now give a formal definition of
"minimal symmetric polynomial".

1.8.13 Definition. Let k and n be positive integers. Suppose $\pi$ is an m-part partition
of n. If $k \ge m$, the minimal symmetric polynomial

$$M_\pi(x_1, x_2, \ldots, x_k) = \sum x_1^{r_1} x_2^{r_2} \cdots x_k^{r_k},$$

where the sum is over all different rearrangements $(r_1, r_2, \ldots, r_k)$ of the k-tuple
$(\pi_1, \pi_2, \ldots, \pi_m, 0, \ldots, 0)$ that is obtained by appending k - m zeros to the end of
$\pi$. If k < m, then $M_\pi(x_1, x_2, \ldots, x_k) = 0$.

If, e.g., $\pi = [\pi_1, \pi_2] = [2, 2]$ and k = 3, the different rearrangements of
$(\pi_1, \pi_2, 0)$ are (2, 2, 0), (2, 0, 2), and (0, 2, 2), and not the six different-looking
ways to rearrange the symbols $\pi_1$, $\pi_2$, and 0. In particular,

$$M_{[2,2]}(x, y, z) = x^2 y^2 + x^2 z^2 + y^2 z^2.$$

If $\pi \vdash n$ and $m = \ell(\pi) \le k$, then each monomial $x_1^{r_1} x_2^{r_2} \cdots x_k^{r_k}$ in Definition 1.8.13
has (total) degree $r_1 + r_2 + \cdots + r_k = \pi_1 + \pi_2 + \cdots + \pi_m = n$, i.e.,
$M_\pi(x_1, x_2, \ldots, x_k)$ is homogeneous of degree n.

1.8.14 Example. From Fig. 1.8.2, there are $p_1(6) + p_2(6) + p_3(6) = 1 + 3 + 3 = 7$
different partitions of 6 having at most three parts. Hence, there are 7 different
minimal symmetric polynomials of degree 6 in the variables x, y, and z, namely,

$M_{[6]}(x, y, z) = x^6 + y^6 + z^6$,
$M_{[5,1]}(x, y, z) = x^5 y + x^5 z + x y^5 + x z^5 + y^5 z + y z^5$,
$M_{[4,2]}(x, y, z) = x^4 y^2 + x^4 z^2 + x^2 y^4 + x^2 z^4 + y^4 z^2 + y^2 z^4$,
$M_{[3^2]}(x, y, z) = x^3 y^3 + x^3 z^3 + y^3 z^3$,
$M_{[4,1^2]}(x, y, z) = x^4 y z + x y^4 z + x y z^4$,
$M_{[3,2,1]}(x, y, z) = x^3 y^2 z + x^3 y z^2 + x^2 y^3 z + x^2 y z^3 + x y^3 z^2 + x y^2 z^3$,

and

$M_{[2^3]}(x, y, z) = x^2 y^2 z^2$. □
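The monomials of a minimal symmetric polynomial can be generated directly from Definition 1.8.13. The Python sketch below is an illustration under stated assumptions: the function name `minimal_symmetric_exponents` is hypothetical, and each monomial is represented by its tuple of exponents $(r_1, r_2, \ldots, r_k)$ rather than by a symbolic expression.

```python
from itertools import permutations

def minimal_symmetric_exponents(partition, k):
    """Exponent vectors of the monomials of M_partition in k variables."""
    m = len(partition)
    if k < m:
        return []  # M_pi is the zero polynomial when k < m
    # Append k - m zeros, then keep each different rearrangement once.
    padded = tuple(partition) + (0,) * (k - m)
    return sorted(set(permutations(padded)), reverse=True)

print(minimal_symmetric_exponents([2, 2], 3))
for pi in ([6], [5, 1], [4, 2], [3, 3], [4, 1, 1], [3, 2, 1], [2, 2, 2]):
    print(pi, len(minimal_symmetric_exponents(pi, 3)))
```

The numbers of monomials found for the seven partitions of Example 1.8.14 should total C(8, 2) = 28, the number of different monomials of degree 6 in three variables.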



Minimal symmetric polynomials are to symmetric polynomials what atoms are
to molecules: they are the basic building blocks.

1.8.15 Theorem. The polynomial $f = f(x_1, x_2, \ldots, x_k)$ is symmetric in $x_1, x_2, \ldots, x_k$
if and only if it is a linear combination of minimal symmetric polynomials.

Proof. Because minimal symmetric polynomials are symmetric, any linear
combination of minimal symmetric polynomials in $x_1, x_2, \ldots, x_k$ is symmetric.

Conversely, suppose $c x_1^{s_1} x_2^{s_2} \cdots x_k^{s_k}$ is among the nonzero terms of $f(x_1, x_2, \ldots, x_k)$.
Then $(s_1, s_2, \ldots, s_k)$ is a rearrangement of $(\alpha_1, \alpha_2, \ldots, \alpha_m, 0, \ldots, 0)$ for some
partition $\alpha$. Because f is symmetric, $c x_1^{r_1} x_2^{r_2} \cdots x_k^{r_k}$ must occur among its terms
for every rearrangement $(r_1, r_2, \ldots, r_k)$ of $(\alpha_1, \alpha_2, \ldots, \alpha_m, 0, \ldots, 0)$, i.e.,
$c M_\alpha(x_1, x_2, \ldots, x_k)$ is a summand of f. Therefore,
$f(x_1, x_2, \ldots, x_k) - c M_\alpha(x_1, x_2, \ldots, x_k)$ is a symmetric polynomial with fewer terms than f, and the result
follows by induction. ■

1.8.16 Example. Let

$f(a, b, c, d) = 2a^3 - a^2 b - a^2 c - a^2 d - a b^2 + abc + abd - a c^2 + acd - a d^2$
$\qquad + 2b^3 - b^2 c - b^2 d - b c^2 + bcd - b d^2 + 2c^3 - c^2 d - c d^2 + 2d^3$.

Probably the easiest way to confirm that this polynomial is symmetric is to express
it as

$f(a, b, c, d) = 2M_{[3]}(a, b, c, d) - M_{[2,1]}(a, b, c, d) + M_{[1^3]}(a, b, c, d)$. □

There are, of course, easier ways to verify that the polynomial
$f(x_1, x_2, \ldots, x_k) = (x_1 + x_2 + \cdots + x_k)^n$ is symmetric than by expressing it as a linear combination of
minimal symmetric polynomials. On the other hand, because it is symmetric,
$f(x_1, x_2, \ldots, x_k)$ is a linear combination of minimal symmetric polynomials.
What combination? The answer to that question is what the multinomial theorem
is all about:

$$(x_1 + x_2 + \cdots + x_k)^n = \sum_{\pi \vdash n} \binom{n}{\pi} M_\pi(x_1, x_2, \ldots, x_k),$$ (1.30)

where the coefficient $\binom{n}{\pi}$ is an abbreviation for the multinomial coefficient whose
bottom row consists of the $\ell(\pi)$ parts of $\pi$. (Recall that $M_\pi(x_1, x_2, \ldots, x_k) = 0$
whenever $k < \ell(\pi)$.)

1.8.17 Example. Together with Example 1.8.14, Equation (1.30) yields

$(x + y + z)^6 = M_{[6]}(x, y, z) + 6M_{[5,1]}(x, y, z) + 15M_{[4,2]}(x, y, z)$
$\qquad + 20M_{[3^2]}(x, y, z) + 30M_{[4,1^2]}(x, y, z)$
$\qquad + 60M_{[3,2,1]}(x, y, z) + 90M_{[2^3]}(x, y, z)$. □
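The seven coefficients in Example 1.8.17 are multinomial coefficients, so they can be recomputed from factorials. The Python sketch below is illustrative only; the helper names `multinomial` and `monomial_count` are hypothetical, and `monomial_count` assumes the partition has at most k parts. As a consistency check, weighting each coefficient by the number of monomials in its minimal symmetric polynomial should account for all $3^6$ words produced when $(x + y + z)^6$ is multiplied out.

```python
from itertools import permutations
from math import factorial

def multinomial(parts):
    """Multinomial coefficient n! / (part_1! part_2! ...), where n = sum(parts)."""
    result = factorial(sum(parts))
    for part in parts:
        result //= factorial(part)
    return result

def monomial_count(partition, k):
    """Number of monomials in M_partition in k variables (assumes k >= length)."""
    padded = tuple(partition) + (0,) * (k - len(partition))
    return len(set(permutations(padded)))

partitions_of_6 = [(6,), (5, 1), (4, 2), (3, 3), (4, 1, 1), (3, 2, 1), (2, 2, 2)]
coefficients = [multinomial(p) for p in partitions_of_6]
total_terms = sum(multinomial(p) * monomial_count(p, 3) for p in partitions_of_6)
print(coefficients, total_terms)
```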

1.8. EXERCISES 

1 Explicitly write down 

(a) all 11 partitions of 6. 

(b) all 8 partitions of 7 having at most three parts. 

(c) all 8 partitions of 7 whose largest part is at most three. 
2 Show that

(a) $p_{n-2}(n) = 2$, $n \ge 4$.

(b) $p_{n-3}(n) = 3$, $n \ge 6$.

(c) for all $n \ge 6$, the last four (nonzero) numbers in row n of the partition
triangle are 3, 2, 1, 1.

(d) $p_2(n) = \lfloor n/2 \rfloor$, the greatest integer not exceeding $\frac{1}{2}n$.

3 Compute rows 11-15 of the partition triangle.

4 Evaluate

(a) p(11). (b) p(12).
(c) p(13). (d) p(14).

5 The number of partitions of n into three or fewer parts turns out to be the
nearest integer to $\frac{1}{12}(n + 3)^2$.

(a) Confirm this fact for $1 \le n \le 6$.

(b) Confirm this fact for $7 \le n \le 10$.

(c) Determine the number of different minimal symmetric polynomials, in
three variables, of degree n = 27.

6 How many different eight-part compositions can be produced by rearranging 
the parts of the partition 

(a) $[5^3, 4, 2^4]$? (b) $[2^5, 1^3]$?

(c) [8,7,6,5,4,3,2,1]? 

(Hint: Don't try to write them all down.) 

7 Confirm, by writing them all down, that there are $p_3(9)$ four-part partitions
$\pi \vdash 10$ that satisfy $\pi_4 = 1$.

8 Confirm Theorem 1.8.10 for the pair 

(a) n = 5 and m = 2. (b) n = 6 and m = 3. 
(c) n = 10 and m = 3. (d) n = 10 and m = 5. 

9 Prove that the partition number $p(n) > 2^{\lfloor \sqrt{n} \rfloor}$ for all sufficiently large n.

10 Exhibit Ferrers diagrams for all the self-conjugate partitions of 
(a) 6. (b) 10. (c) 17. 

11 Let $p_{\mathrm{odd}}(n)$ be the number of partitions of n each of whose parts is odd and
$p_{\mathrm{dist}}(n)$ be the number of partitions of n having distinct parts. It is proved in
Section 4.3 that $p_{\mathrm{odd}}(n) = p_{\mathrm{dist}}(n)$ for all n. Confirm this result now for the
case

(a) n = 5. (b) n = 6. (c) n = 7. (d) n = 8.

12 The first odd "abundant" number is 945.

(a) How many positive integer divisors does 945 have? 

(b) Sum up the "proper" divisors of 945 (those divisors less than 945). 

(c) What do you suppose an "abundant" number is? 

13 Prove that the number of partitions of n with at most m parts is equal to the
number of partitions of n + m with exactly m parts, i.e., prove that

$$\sum_{k=1}^{m} p_k(n) = p_m(n + m)$$

(a) by induction on m. 

(b) by means of Ferrers diagrams. 

14 Prove that 

(a) $p_m(n) = p_m(n - m) + p_{m-1}(n - m) + \cdots + p_1(n - m)$, m < n.

(b) $p(n) = p_n(2n)$.

(c) $p(n) = p_{n+m}(2n + m)$, $m \ge 0$.

(d) For all $n \ge 8$, the last five (nonzero) numbers in row n of the partition
triangle are 5, 3, 2, 1, 1.

(e) What is the generalization of Exercises 2(c) and 14(d)? 

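15 Suppose $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_m]$ and $\beta = [\beta_1, \beta_2, \ldots, \beta_k]$ are two partitions of n.
Then $\alpha$ majorizes $\beta$ if $m \le k$ and

$$\sum_{i=1}^{r} \alpha_i \ge \sum_{i=1}^{r} \beta_i, \quad 1 \le r \le m.$$

(a) Show that [6, 4] majorizes [4, 3, 2, 1].

(b) Show that [4, 3, 2, 1] majorizes $[3^2, 2^2]$.

(c) If $\alpha$ majorizes $\beta$ and $\beta$ majorizes $\gamma$, prove that $\alpha$ majorizes $\gamma$.

(d) Prove that $\alpha$ majorizes $\beta$ if and only if $\beta^*$ majorizes $\alpha^*$.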

16 Confirm that the coefficients 1, 6, 15, 20, 30, 60, and 90 in Example 1.8.17 are 
all correct. 

17 Prove that the number of self-conjugate partitions of n is equal to the number 
of partitions of n that have distinct parts each of which is odd. 

18 The great Indian mathematician Srinivasa Ramanujan (1887-1920) proved a 
number of theorems about partition numbers. Among them is the fact that 
p(5n + 4) is always a multiple of 5. Confirm this fact for n = 0, 1, and 2.
19 We saw in Section 1.6 that the equation a + b + c + d + e = 10 has a total of
C(9, 4) = 126 different positive integer solutions. Of these, how many satisfy

$a \ge b \ge c \ge d \ge e$?

20 Denote by t(n) the number of partitions of n each of whose parts is a power
of 2 (including $2^0 = 1$).

(a) Compute t(n), $1 \le n \le 6$.

(b) Prove that t(2n + 1) = t(2n), $n \ge 1$.

(c) Prove that t(2n) = t(n) + t(2n - 2), $n \ge 2$.

(d) Prove that t(n) is even, $n \ge 2$.

21 When $p(a, b, c, d) = (a + b + c + d)^{10}$ is expressed as a linear combination of
minimal symmetric polynomials, compute the coefficient of

(a) M [% ^(a,b,c,d). (b) M [10 ](a,b,c,d). 

(c) $M_{[3^2, 2^2]}(a, b, c, d)$. (d) $M_{[3^2, 2, 1^2]}(a, b, c, d)$.

22 Compute the coefficient of 

(a) M[ 2 ^](xi,x 2 ,X3,X4,x 5 ,x 6 ) in (xi +x 2 + x 3 +x 4 + x 5 + x 6 ) 5 . 

(b) $M_{[2, 1^3]}(x_1, x_2, x_3, x_4, x_5)$ in $(x_1 + x_2 + x_3 + x_4 + x_5)^5$.

23 Express p(x, y,z) as a linear combination of minimal symmetric polynomials, 
where 

(a) $p(x, y, z) = 5x^2 + 5y^2 + 5z^2 - xy - xz - yz$.

(b) $p(x, y, z) = 2x(1 + 2yz) - 3x^2 + 2y - 3y^2 + 2z - 3z^2$.

24 Write out, in full, 

(a) M [5] (w,x,y,z). (b) M [4A] (w,x,y,z). 

(c) M [l3] (w,x,y,z). (d) M |8il| (x,v,z). 

(e) $M_{[3,2,1]}(x, y, z)$. (f) $M_{[3,1^2]}(x, y, z)$.

25 Theorem 1.8.15 can be used to custom design symmetric polynomials. The
homogeneous symmetric function of degree n is defined by $H_0(x_1, x_2, \ldots, x_k) = 1$ and

$$H_n(x_1, x_2, \ldots, x_k) = \sum_{\pi \vdash n} M_\pi(x_1, x_2, \ldots, x_k), \quad n \ge 1,$$

where, recall, $M_\pi(x_1, x_2, \ldots, x_k) = 0$ whenever $\ell(\pi) > k$. Explicitly write out
all the terms in

(a) $H_2(x, y)$. (b) $H_3(x, y)$.

(c) $H_2(a, b, c)$. (d) $H_3(a, b, c)$.

26 Let $H_n(x_1, x_2, \ldots, x_k)$ be the homogeneous symmetric function defined in
Exercise 25.

(a) Compare and contrast $H_n(x_1, x_2, \ldots, x_k)$ with $(x_1 + x_2 + \cdots + x_k)^n$. (Hint:
See Equation (1.30).)

(b) Show that $H_n(x_1, x_2, \ldots, x_k)$ is the sum of $p_1(n) + p_2(n) + \cdots + p_k(n)$
different minimal symmetric polynomials.

(c) Prove that $H_n(x_1, x_2, \ldots, x_k)$ is the sum of C(n + k - 1, n) different terms.
(Hint: Theorem 1.7.5.)

(d) Prove that $H_n(x_1, x_2, \ldots, x_k) = H_n(x_1, x_2, \ldots, x_{k-1}) + x_k H_{n-1}(x_1, x_2, \ldots, x_k)$.

(e) Prove that $H_s(x_1, x_2, \ldots, x_n) - H_s(x_2, \ldots, x_n, x_{n+1}) = (x_1 - x_{n+1}) H_{s-1}(x_1, x_2, \ldots, x_{n+1})$.

27 Suppose m is a nonnegative integer. A lattice path of length m in the cartesian 
plane begins at the origin and consists of m unit "steps" each of which is 
either up or to the right. If s of the steps are up and r = m — s of them are to 
the right, the path terminates at the point (r,s). "Directions" for the lattice 
path illustrated in Fig. 1.8.5 might go something like this: Beginning from 
(0, 0) (the lower left-hand corner), take two steps up, two to the right, one up, 
three right, one up, one right, and one up. If this grid were a street map and one 
were in the business of delivering packages, lattice paths would probably 
be called "routes", and these directions might be given in shorthand as
UURRURRRURU. Suppose r and s are fixed but arbitrary nonnegative 
integers, with r + s > 0. 

Figure 1.8.5. [A lattice path from (0, 0) to (6, 5).]

(a) Compute the number of different lattice paths from (0,0) to (r, s). 

(b) The lattice path in Figure 1.8.5 "partitions" the 5 x 6 grid into two pieces. 
In this case, the piece above the path might easily be mistaken for the 
Ferrers diagram of partition $\pi = [6, 5, 2]$. Use this observation to compute
the number of partitions that have at most s parts each of which is at most r. 
(c) As an alternative to the alphabet {R, U}, one could just as well encode lattice
paths using, say, the horizontal displacement of each step. In this scheme, each 
vertical step would correspond to a 0 and each horizontal step to a 1. For 
example, the lattice path in Fig. 1.8.5 would be encoded as the binary word 
00110111010, a word of length 11 and "weight" 6. Compute the number of 
different binary words of length r + s and weight r. 

(d) Consider a binary word $w = b_1 b_2 \ldots b_m$ of length m consisting of the letters
(bits) $b_1, b_2, \ldots, b_m$. The inversion number $\mathrm{inv}(b_i) = 0$ if $b_i = 1$; if $b_i = 0$, it is
the number of 1's to the left of $b_i$. If, e.g., u = 00110111010 (corresponding to
Fig. 1.8.5), the inversion numbers of its bits are 0, 0, 0, 0, 2, 0, 0, 0, 5, 0, and 6,
respectively. In this case, the nonzero inversion numbers of u are precisely the
parts of the corresponding partition $\pi$ from part (b). Show that, in general, the
nonzero inversion numbers of the bits of w are the parts of the partition to
which w corresponds.

28 Galileo Galilei (1564-1642) once wondered about the frequency of throwing 
totals of 9 and 10 with three dice. 

(a) Show that 9 and 10 have the same number of 3-part partitions each of
whose parts is at most 6. 

(b) Explain why it does not follow that 9 and 10 occur with equal frequency 
when three dice are rolled (repeatedly). 

29 Suppose $\pi$ is an m-part partition of n. Show that the number of different
compositions of n that can be obtained by rearranging the parts of $\pi$ is
multinomial coefficient
What immortal hand or eye could frame thy fearful symmetry?

— William Blake (Songs of Experience) 

Let's begin by exploring the relationship between the coefficients of a monic
polynomial

$$p(x) = x^n + c_1 x^{n-1} + c_2 x^{n-2} + \cdots + c_n$$ (1.31a)

and its roots $a_1, a_2, \ldots, a_n$. Writing p(x) in the form

$$p(x) = (x - a_1)(x - a_2) \cdots (x - a_n)$$ (1.31b)
suggests mimicking the alternative view of distributivity used to prove the binomial
theorem, i.e., select one of x or $-a_1$ from the first set of parentheses, one of x or $-a_2$
from the second set, and so on. Finally, choose one of x or $-a_n$ from the nth set.
String these selections together, in order, so as to create an n-letter "word", something
like

$$(-a_1)xxx(-a_5)x \ldots xx.$$

If the total number of x's in this word is n - r, then the remaining "letters" are of
the form $(-a_i)$ for r different values of i.

The sum of all such words is an inventory of the $2^n$ ways to make the sequence
of decisions. Replacing each word with a monomial of the form

$$(-1)^r (a_1 a_5 \cdots) x^{n-r}$$

and combining terms of the same degree (in x) should yield Equation (1.31a). So,
the coefficient of $x^{n-r}$ in Equation (1.31a) must be the sum of all possible terms of
the form

$$(-1)^r a_{i_1} a_{i_2} \cdots a_{i_r},$$

where $1 \le i_1 < i_2 < \cdots < i_r \le n$. In other words, $c_r$ is $(-1)^r$ times the sum of the
products of the roots taken r at a time. Let's give that sum a name.

1.9.1 Definition. The rth elementary symmetric function

$$E_r(x_1, x_2, \ldots, x_n)$$

is the sum of all possible products of r elements chosen from $\{x_1, x_2, \ldots, x_n\}$ without
replacement where order doesn't matter.

Evidently, $E_r(x_1, x_2, \ldots, x_n)$ is the sum of all C(n, r) "square-free" monomials
of (total) degree r in the variables $x_1, x_2, \ldots, x_n$. Our conclusions about the relationship
between roots and coefficients can now be stated as follows.

1.9.2 Theorem. Let $a_1, a_2, \ldots, a_n$ be the roots of a monic polynomial
$p(x) = x^n + c_1 x^{n-1} + c_2 x^{n-2} + \cdots + c_n$. Then

$$c_r = (-1)^r E_r(a_1, a_2, \ldots, a_n), \quad 1 \le r \le n.$$ (1.32)

1.9.3 Example. Suppose $f(x) = x^4 - x^2 + 2x + 2$. Then, counting multiplicities,
f(x) has four (complex) roots; call them $a_1$, $a_2$, $a_3$, and $a_4$. Setting
$E_r = E_r(a_1, a_2, a_3, a_4)$ and comparing the actual coefficients of f(x) with the generic
formula $f(x) = x^4 - E_1 x^3 + E_2 x^2 - E_3 x + E_4$, we find that

$0 = E_1(a_1, a_2, a_3, a_4) = a_1 + a_2 + a_3 + a_4$,
$-1 = E_2(a_1, a_2, a_3, a_4) = a_1 a_2 + a_1 a_3 + a_1 a_4 + a_2 a_3 + a_2 a_4 + a_3 a_4$,
$-2 = E_3(a_1, a_2, a_3, a_4) = a_1 a_2 a_3 + a_1 a_2 a_4 + a_1 a_3 a_4 + a_2 a_3 a_4$,
$2 = E_4(a_1, a_2, a_3, a_4) = a_1 a_2 a_3 a_4$.

So, just from its coefficients, we can tell, e.g., that the sum of the roots of f(x) is 0
and that their product is 2. □

1.9.4 Example. Suppose $a_i = 1$, $1 \le i \le n$, so that

$p(x) = (x - 1)^n$
$\quad = C(n, 0)x^n - C(n, 1)x^{n-1} + C(n, 2)x^{n-2} - \cdots + (-1)^n C(n, n)$.

In this case, $E_r(1, 1, \ldots, 1) = C(n, r)$, $1 \le r \le n$, which makes perfect sense. After
all, $E_r(a_1, a_2, \ldots, a_n)$ is the sum of all C(n, r) products of the $a_i$'s taken r at a time.
If $a_i = 1$ for all i, then every one of these products is 1, and their sum is
$E_r(1, 1, \ldots, 1) = C(n, r)$. □

Consistent with the fact that the leading coefficient of a monic polynomial is 1,
we define $E_0(x_1, x_2, \ldots, x_n) = 1$.

1.9.5 Example. If $a_i = i$, $1 \le i \le 4$, then

$E_0(1, 2, 3, 4) = 1$,
$E_1(1, 2, 3, 4) = 1 + 2 + 3 + 4 = 10$,
$E_2(1, 2, 3, 4) = 1 \times 2 + 1 \times 3 + 1 \times 4 + 2 \times 3 + 2 \times 4 + 3 \times 4 = 35$,
$E_3(1, 2, 3, 4) = 1 \times 2 \times 3 + 1 \times 2 \times 4 + 1 \times 3 \times 4 + 2 \times 3 \times 4 = 50$,
$E_4(1, 2, 3, 4) = 1 \times 2 \times 3 \times 4 = 24$.

If $p(x) = (x - 1)(x - 2)(x - 3)(x - 4)$, then, with the abbreviation
$E_r = E_r(1, 2, 3, 4)$, $0 \le r \le 4$, Theorem 1.9.2 yields

$p(x) = E_0 x^4 - E_1 x^3 + E_2 x^2 - E_3 x + E_4$
$\quad = x^4 - 10x^3 + 35x^2 - 50x + 24$.

Let's confirm this directly:

$p(x) = (x - 1)(x - 2)(x - 3)(x - 4)$
$\quad = (x^2 - 3x + 2)(x^2 - 7x + 12)$
$\quad = x^4 - (7 + 3)x^3 + (12 + 21 + 2)x^2 - (36 + 14)x + 24$. □
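The arithmetic of Example 1.9.5 can be repeated mechanically. The Python sketch below is an illustration, not the author's method; the function names `elementary_symmetric` and `monic_from_roots` are hypothetical. It evaluates $E_r(1, 2, 3, 4)$ straight from Definition 1.9.1, multiplies out $(x - 1)(x - 2)(x - 3)(x - 4)$ one factor at a time, and compares the resulting coefficients with the prediction of Theorem 1.9.2.

```python
from itertools import combinations
from math import prod

def elementary_symmetric(r, values):
    """Sum of the products of the values taken r at a time (E_0 = 1)."""
    return sum(prod(choice) for choice in combinations(values, r))

def monic_from_roots(roots):
    """Coefficients, highest degree first, of the product of (x - a) over the roots."""
    coeffs = [1]
    for a in roots:
        # Multiplying by (x - a): shift the coefficients, then subtract a times them.
        coeffs = [c - a * prev for c, prev in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

roots = [1, 2, 3, 4]
E = [elementary_symmetric(r, roots) for r in range(5)]
predicted = [(-1) ** r * E[r] for r in range(5)]
print(E, predicted, monic_from_roots(roots))
```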

Apart from their intrinsic significance, elementary symmetric functions have
important (and, in some cases, unexpected) connections with other combinatorial
objects. Recall, e.g., that the number of ways to choose n + 1 items from an
m-element set without replacement where order matters is

$$P(m, n + 1) = m(m - 1)(m - 2) \cdots (m - n).$$

1.9.6 Definition. The falling factorial function is defined by $x^{(0)} = 1$ and

$$x^{(n+1)} = x(x - 1)(x - 2) \cdots (x - n), \quad n \ge 0.$$

Since $x^{(n+1)}$ is a polynomial of degree n + 1, whose roots are 0, 1, ..., n, and
because $E_r(0, 1, \ldots, n) = E_r(1, 2, \ldots, n)$, $0 \le r \le n$, it follows that

$x^{(n+1)} = x^{n+1} - E_1(1, 2, \ldots, n)x^n + E_2(1, 2, \ldots, n)x^{n-1} - \cdots + (-1)^n E_n(1, 2, \ldots, n)x$.

In particular,

$P(m, n + 1) = m[m^n - E_1(1, 2, \ldots, n)m^{n-1} + E_2(1, 2, \ldots, n)m^{n-2} - \cdots + (-1)^n E_n(1, 2, \ldots, n)]$.

Let's take a brief excursion* and investigate the numbers $E_t(1, 2, \ldots, n)$.

1.9.7 Definition. The elementary number

$$e(n, t) = \begin{cases} 0, & t < 0 \text{ or } t > n, \\ E_t(1, 2, \ldots, n), & 0 \le t \le n. \end{cases}$$

Apart from Example 1.9.5, where we computed

$(x - 1)(x - 2)(x - 3)(x - 4) = x^4 - 10x^3 + 35x^2 - 50x + 24$
$\quad = x^4 - e(4, 1)x^3 + e(4, 2)x^2 - e(4, 3)x + e(4, 4)$,

we know that

$e(n, 0) = E_0(1, 2, \ldots, n) = 1$,

$e(n, 1) = E_1(1, 2, \ldots, n) = 1 + 2 + \cdots + n = \frac{1}{2}n(n + 1)$,

$e(n, n) = E_n(1, 2, \ldots, n) = 1 \times 2 \times \cdots \times n = n!$.

*There is a serious side to this excursion. In Chapter 2, we will discover that
$s(n, r) = E_{n-r}(1, 2, \ldots, n - 1)$ is a Stirling number of the first kind.
Figure 1.9.1. Elementary triangle.

This gives us a start at filling in some entries of the elementary triangle exhibited
in Fig. 1.9.1. What is (momentarily) missing is a recurrence for the elementary
numbers analogous to Pascal's relation for binomial coefficients and/or to Theorem
1.8.7 for partition numbers.

1.9.8 Lemma. If n > t > 1, then

$$e(n, t) = e(n - 1, t) + n e(n - 1, t - 1).$$

Proof. $E_t(1, 2, \ldots, n) = e(n, t)$ is the sum of all C(n, t) products of the numbers
1, 2, ..., n taken t at a time. Some of these products involve n, and some do not. The
sum of the products that do not involve n is $E_t(1, 2, \ldots, n - 1) = e(n - 1, t)$. When
n is factored out of the remaining terms, the other factor is
$E_{t-1}(1, 2, \ldots, n - 1) = e(n - 1, t - 1)$. ■

From Fig. 1.9.1 and Lemma 1.9.8 we see, e.g., that

e(3, 2) = e(2, 2) + 3e(2, 1)
        = 2 + 3 × 3
        = 11.

Similarly,

e(5, 2) = e(4, 2) + 5e(4, 1)
        = 35 + 5 × 10
        = 85,

and

e(5, 3) = e(4, 3) + 5e(4, 2)
        = 50 + 5 × 35
        = 225.



Continuing in this way, a row at a time, one obtains Fig. 1.9.2. 
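Filling in the elementary triangle a row at a time is a natural job for a short program. The Python sketch below is illustrative only; the function name `elementary_triangle` is hypothetical, and it assumes the boundary values e(n, 0) = 1 and e(n, t) = 0 for t > n from Definition 1.9.7, together with the convention e(0, 0) = 1 adopted here to start the recurrence of Lemma 1.9.8.

```python
def elementary_triangle(max_n):
    """Return a table e with e[n][t] = E_t(1, 2, ..., n)."""
    e = [[0] * (max_n + 1) for _ in range(max_n + 1)]
    e[0][0] = 1  # assumed convention: the empty product
    for n in range(1, max_n + 1):
        e[n][0] = 1
        for t in range(1, n + 1):
            # Lemma 1.9.8: products that avoid n, plus n times the shorter products.
            e[n][t] = e[n - 1][t] + n * e[n - 1][t - 1]
    return e

e = elementary_triangle(6)
for n in range(1, 7):
    print(n, e[n][:n + 1])
```

Row 4 should agree with the coefficients found in Example 1.9.5, and row 5 with the values of e(5, 2) and e(5, 3) computed above.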
Figure 1.9.2. The elementary numbers e(n, t).

As their name implies, elementary symmetric functions are symmetric. Because
multiplication is commutative, the coefficients of

$$p(x) = (x - 1)(x - 2)(x - 3)(x - 4)$$

are identical to the coefficients of

$$p(x) = (x - 3)(x - 1)(x - 4)(x - 2);$$

the sum of the products of $x_1, x_2, \ldots, x_n$ taken t at a time is equal to the sum of the
products of any rearrangement of the x's, taken t at a time. In fact, elementary symmetric
functions are minimal symmetric polynomials!

1.9.9 Theorem. The tth elementary symmetric function is identical to the minimal
symmetric polynomial corresponding to the partition $[1^t]$, i.e.,

$$M_{[1^t]}(x_1, x_2, \ldots, x_n) = E_t(x_1, x_2, \ldots, x_n).$$

Proof. If $(r_1, r_2, \ldots, r_n)$ is some rearrangement of the sequence (1, 1, ..., 1, 0,
0, ..., 0) consisting of t 1's followed by n - t 0's, then

$$x_1^{r_1} x_2^{r_2} \cdots x_n^{r_n} = x_{i_1} x_{i_2} \cdots x_{i_t},$$

where $1 \le i_1 < i_2 < \cdots < i_t \le n$, $r_{i_1} = r_{i_2} = \cdots = r_{i_t} = 1$, and the rest of the r's
are zero. Adding the monomials corresponding to all possible rearrangements of
(1, 1, ..., 1, 0, 0, ..., 0) yields

$$M_{[1^t]}(x_1, x_2, \ldots, x_n) = \sum x_{i_1} x_{i_2} \cdots x_{i_t},$$ (1.33)

where the sum is over $1 \le i_1 < i_2 < \cdots < i_t \le n$. In other words, the right-hand
side of Equation (1.33) is the sum of all C(n, t) products of the x's taken t at a
time, which is the definition of $E_t(x_1, x_2, \ldots, x_n)$. ■

Conjugate to $[1^t]$ is the partition [t].

1.9.10 Definition. The minimal symmetric polynomial corresponding to [t] is
the tth power sum, abbreviated

$M_t(x_1, x_2, \ldots, x_n) = M_{[t]}(x_1, x_2, \ldots, x_n)$
$\quad = x_1^t + x_2^t + \cdots + x_n^t$.

If t = 1, then

$M_1(x_1, x_2, \ldots, x_n) = x_1 + x_2 + \cdots + x_n$
$\quad = E_1(x_1, x_2, \ldots, x_n)$. (1.34)

Our interest in power sums goes back to Section 1.5, where it was discovered, 
e.g., that 

Mi(l,2,...,;i) = 1 + 2 + ---+M 
= i«(n+l), 

M 2 (l,2,...,n) = l 2 + 2 2 + --- + n 2 

= in(n+l)(2n+l), (1.35) 

M 3 (l,2,...,n) = l 3 + 2 3 + --- + « 3 

= \n 2 (n+l) 2 , (1.36) 

and so on. 
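
As a quick sanity check on Equations (1.35) and (1.36), the following Python sketch compares the power sums of 1, 2, ..., n with the closed forms for a range of n. The function names are invented for this illustration.

```python
def power_sum(xs, t):
    """M_t(xs): the sum of the t-th powers of the entries of xs."""
    return sum(x ** t for x in xs)


def closed_forms_agree(n):
    """Compare M_1, M_2, M_3 at (1, 2, ..., n) with the closed forms quoted above."""
    xs = range(1, n + 1)
    return (
        2 * power_sum(xs, 1) == n * (n + 1)
        and 6 * power_sum(xs, 2) == n * (n + 1) * (2 * n + 1)   # Equation (1.35)
        and 4 * power_sum(xs, 3) == n ** 2 * (n + 1) ** 2       # Equation (1.36)
    )


print(all(closed_forms_agree(n) for n in range(1, 200)))
```

Multiplying through by the denominators keeps the comparison in exact integer arithmetic.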

Recall (Theorem 1.8.15) that a polynomial in n variables is symmetric if and
only if it is a linear combination of minimal symmetric polynomials. In this sense,
the minimal symmetric polynomials are building blocks from which all symmetric
polynomials can be constructed. The power sums are also building blocks, but in a
different sense. The following result is proved in Appendix A1.

1.9.11 Theorem. Any polynomial symmetric in the variables $x_1, x_2, \ldots, x_n$ is a
polynomial in the power sums $M_t = M_t(x_1, x_2, \ldots, x_n)$, $1 \le t \le n$.*



*To be encountered in Section 3.6, the symmetric "pattern inventory" is a polynomial in the power sums.
A description of that polynomial is the substance of Polya's theorem.

1.9.12 Example. We do not need Theorem 1.9.11 to tell us that $p(x, y, z) =
(x + y + z)^3$ is a polynomial in the power sums. By definition, $p(x, y, z) =
M_1(x, y, z)^3$. What about something more interesting, like $M_{[2,1]}(x, y, z) = x^2 y +
x^2 z + x y^2 + x z^2 + y^2 z + y z^2$? Observe that the product

\begin{align*}
M_2(x, y, z) M_1(x, y, z) &= (x^2 + y^2 + z^2)(x + y + z) \\
&= x^3 + y^3 + z^3 + x^2 y + x^2 z + x y^2 + x z^2 + y^2 z + y z^2 \\
&= M_3(x, y, z) + M_{[2,1]}(x, y, z).
\end{align*}

So, $M_{[2,1]}(x, y, z) = M_2(x, y, z) M_1(x, y, z) - M_3(x, y, z)$. Similarly,

\begin{align*}
M_2(x, y, z)^2 &= (x^2 + y^2 + z^2)^2 \\
&= x^4 + y^4 + z^4 + 2x^2 y^2 + 2x^2 z^2 + 2y^2 z^2 \\
&= M_4(x, y, z) + 2M_{[2,2]}(x, y, z),
\end{align*}

so that $M_{[2,2]}(x, y, z) = \tfrac{1}{2}\,[M_2(x, y, z)^2 - M_4(x, y, z)]$. □



1.9.13 Example. Let's see how to express elementary symmetric functions as
polynomials in the power sums. Already having observed that $E_1(x, y, z) =
M_1(x, y, z)$, consider $E_2(x, y, z) = xy + xz + yz$. Rearranging terms in

\begin{align*}
M_1(x, y, z)^2 &= (x + y + z)^2 \\
&= (x^2 + y^2 + z^2) + (2xy + 2xz + 2yz) \\
&= M_2(x, y, z) + 2E_2(x, y, z)
\end{align*}

yields

\[ E_2(x, y, z) = \tfrac{1}{2}\,[M_1(x, y, z)^2 - M_2(x, y, z)]. \tag{1.37} \]

Similar computations starting from $M_1(x, y, z)^3 = (x + y + z)^3$ lead to the identity

\[ E_3(x, y, z) = \tfrac{1}{6}\,[M_1(x, y, z)^3 - 3M_1(x, y, z) M_2(x, y, z) + 2M_3(x, y, z)]. \tag{1.38} \]

(Confirm it.) □
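
The identities in Examples 1.9.12 and 1.9.13 can be spot-checked on integer values of x, y, and z. The sketch below is only an illustration: the function minimal_symmetric (a name invented here) builds $M_\lambda$ straight from the definition, as the sum of the monomials whose exponent sequences are the distinct rearrangements of the partition padded with zeros.

```python
from itertools import combinations, permutations
from math import prod


def minimal_symmetric(partition, xs):
    """M_partition(xs), summed over the distinct rearrangements of the exponents."""
    exps = list(partition) + [0] * (len(xs) - len(partition))
    return sum(prod(x ** r for x, r in zip(xs, rs)) for rs in set(permutations(exps)))


def M(t, xs):
    """Power sum M_t."""
    return minimal_symmetric([t], xs)


def E(t, xs):
    """Elementary symmetric function E_t."""
    return sum(prod(c) for c in combinations(xs, t))


xs = (2, 3, 5)
print(minimal_symmetric([2, 1], xs) == M(2, xs) * M(1, xs) - M(3, xs))
print(2 * minimal_symmetric([2, 2], xs) == M(2, xs) ** 2 - M(4, xs))
print(2 * E(2, xs) == M(1, xs) ** 2 - M(2, xs))                               # (1.37)
print(6 * E(3, xs) == M(1, xs) ** 3 - 3 * M(1, xs) * M(2, xs) + 2 * M(3, xs))  # (1.38)
```

A check at one point is not a proof, of course, but it is a cheap way to catch a sign or coefficient error before attempting the algebra.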






Surely, Equations (1.37) and (1.38) are examples of some more general relationship
between power sums and elementary symmetric functions. To discover what
that pattern is, let's return to the source. Suppose, e.g., that

\begin{align*}
p(x) &= (x - a_1)(x - a_2) \cdots (x - a_n) \\
&= x^n - E_1 x^{n-1} + E_2 x^{n-2} - \cdots + (-1)^n E_n,
\end{align*}

where $E_r = E_r(a_1, a_2, \ldots, a_n)$. Substituting $x = a_i$ in this equation yields

\begin{align*}
0 &= p(a_i) \\
&= a_i^n - E_1 a_i^{n-1} + E_2 a_i^{n-2} - \cdots + (-1)^n E_n.
\end{align*}

Summing on $i$ and setting $M_t = M_t(a_1, a_2, \ldots, a_n) = a_1^t + a_2^t + \cdots + a_n^t$, we obtain

\[ 0 = M_n - E_1 M_{n-1} + E_2 M_{n-2} - \cdots + (-1)^n n E_n, \]

the $t = n$ case of the following.

1.9.14 Newton's Identities.† For a fixed but arbitrary positive integer n, let
$M_r = M_r(x_1, x_2, \ldots, x_n)$ and $E_r = E_r(x_1, x_2, \ldots, x_n)$. Then, for all $t \ge 1$,

\[ M_t - M_{t-1} E_1 + M_{t-2} E_2 - \cdots + (-1)^{t-1} M_1 E_{t-1} + (-1)^t t E_t = 0. \tag{1.39} \]



1.9.15 Example. The first four of Newton's identities are equivalent to

\begin{align*}
M_1 &= E_1, \\
M_2 - M_1 E_1 &= -2E_2, \\
M_3 - M_2 E_1 + M_1 E_2 &= 3E_3, \\
M_4 - M_3 E_1 + M_2 E_2 - M_1 E_3 &= -4E_4.
\end{align*}

The first identity, $M_1 = E_1$, is the same as Equation (1.34). Substituting $M_1$ for $E_1$ in
the second identity yields

\[ E_2 = \tfrac{1}{2}\,[M_1^2 - M_2], \tag{1.40} \]

extending to n variables and confirming Equation (1.37). Eliminating $E_1$ and $E_2$
from the third identity recaptures the following extension of Equation (1.38):

\[ E_3 = \tfrac{1}{6}\,[M_1^3 - 3M_1 M_2 + 2M_3]. \tag{1.41} \]

Eliminating $E_1$, $E_2$, and $E_3$ from the fourth identity produces something new,
namely,

\[ E_4 = \tfrac{1}{24}\,[M_1^4 - 6M_1^2 M_2 + 8M_1 M_3 + 3M_2^2 - 6M_4]. \tag{1.42} \]

Evidently, Newton's identities can be used to express any elementary symmetric
function as a polynomial in the power sums. □

†Named for Isaac Newton (1642-1727).

Because $E_3(x_1, x_2) = 0$, the right-hand side of Equation (1.41) had better be zero
when $n = 2$. Let's confirm that it is:

\begin{align*}
M_1^3 + 2M_3 &= (x_1 + x_2)^3 + 2(x_1^3 + x_2^3) \\
&= 3x_1^3 + 3x_1^2 x_2 + 3x_1 x_2^2 + 3x_2^3 \\
&= 3(x_1 + x_2)(x_1^2 + x_2^2) \\
&= 3M_1 M_2.
\end{align*}

So, as predicted, $M_1^3 - 3M_1 M_2 + 2M_3 = 0$. More generally, because
$E_{n+r}(x_1, x_2, \ldots, x_n) = 0$, $r \ge 1$, Equation (1.39) has a simpler form when $t > n$, namely,

\[ M_t - M_{t-1} E_1 + M_{t-2} E_2 - \cdots + (-1)^n M_{t-n} E_n = 0. \tag{1.43} \]

A proof of Newton's identities for all $t \ge 1$ can be found in Appendix A1.
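
Equation (1.39) can be read as a recipe: once $E_1, \ldots, E_{t-1}$ are known, it can be solved for $E_t$. The following Python sketch carries out that recursion in exact rational arithmetic. It is an illustration under stated assumptions, not the author's method; the function name and the convention $E_0 = 1$ are choices made here.

```python
from fractions import Fraction


def elementary_from_power_sums(M):
    """Given M = [M_1, ..., M_T], return [E_1, ..., E_T] by solving (1.39) for E_t."""
    E = [Fraction(1)]                       # E_0 = 1 by convention
    for t in range(1, len(M) + 1):
        # s = M_t - M_{t-1} E_1 + M_{t-2} E_2 - ... + (-1)^(t-1) M_1 E_{t-1}
        s = sum((-1) ** r * M[t - r - 1] * E[r] for r in range(t))
        # (1.39) says s + (-1)^t * t * E_t = 0.
        E.append(Fraction((-1) ** (t - 1)) * s / t)
    return E[1:]


xs = [1, 2, 3, 4]
power_sums = [sum(x ** t for x in xs) for t in range(1, 7)]
print(elementary_from_power_sums(power_sums))
```

For the four numbers 1, 2, 3, 4, the first four outputs should be the elementary numbers 10, 35, 50, 24, and the last two should vanish, in keeping with Equation (1.43).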

1.9. EXERCISES 

1 Without computing the roots of $f(x) = x^4 - x^2 + 2x + 2$, it was argued in
Example 1.9.3 that their elementary symmetric functions are $E_1 = 0$,
$E_2 = -1$, $E_3 = -2$, and $E_4 = 2$. Confirm this result by finding the four roots
and then computing their elementary symmetric functions directly from the
definition.

2 Show that $(a^2 + b^2) - (a + b)(a + b) + 2ab = 0$ (thus confirming the $n = t = 2$
case of Newton's identities).

3 Find the elementary symmetric functions of the roots of

(a) $x^4 - 5x^3 + 6x^2 - 2x + 1$. (b) $x^4 + 5x^3 + 6x^2 + 2x + 1$.
(c) $x^4 + 5x^3 - 6x^2 + 2x - 1$. (d) $2x^4 + 10x^3 - 12x^2 + 4x - 2$.
(e) $x^5 - x^3 + 3x^2 + 4x - 8$. (f) $x^5 + x^4 - 2x$.

4 Compute

(a) $E_t(1, 2, 3, 4, 5)$, $1 \le t \le 5$, directly from Definition 1.9.1. (Hint: Use row 5
of Fig. 1.9.2 to check your answers.)

(b) $E_5(1, 2, 3, 4, 5, 6, 7)$.

5 Find the missing coefficients in

(a) $x^{(5)} = x^5 - 10x^4 + 35x^3 - \_\_\_\,x^2 + \_\_\_\,x - \_\_\_$.

(b) $x^{(6)} = x^6 - \_\_\_\,x^5 + \_\_\_\,x^4 - 225x^3 + \_\_\_\,x^2 - \_\_\_\,x$.

6 Compute

(a) $E_3(1, 2, 3, 4, 5, 6, 7, 8)$. (b) $E_4(1, 2, 3, 4, 5, 6, 7, 8)$.

(c) $E_6(1, 2, 3, 4, 5, 6, 7, 8)$. (d) $E_7(1, 2, 3, 4, 5, 6, 7, 8)$.

7 Let $f(x) = b_0 x^n + b_1 x^{n-1} + \cdots + b_{n-1} x + b_n$ be a polynomial of degree n
whose roots are $a_1, a_2, \ldots, a_n$. Prove that $E_t(a_1, a_2, \ldots, a_n) = (-1)^t b_t / b_0$.

8 Confirm that $6(abc + abd + acd + bcd) = M_1^3 - 3M_1 M_2 + 2M_3$, where
$M_t = a^t + b^t + c^t + d^t$, $1 \le t \le 3$.

9 Newton's identities were used in Equations (1.40)-(1.42) to express
$E_t = E_t(x_1, x_2, \ldots, x_n)$ as a polynomial in the power sums $M_t = M_t(x_1,
x_2, \ldots, x_n)$, $2 \le t \le 4$.

(a) Confirm by a direct computation that

\[ a^2 + b^2 + c^2 + d^2 = E_1(a, b, c, d)^2 - 2E_2(a, b, c, d). \]

(b) Show that $M_2 = E_1^2 - 2E_2$ for arbitrary n.

(c) Express $M_3$ as a polynomial in elementary symmetric functions.

(d) Show that $M_4 = E_1^4 - 4E_1^2 E_2 + 4E_1 E_3 + 2E_2^2 - 4E_4$.

(e) Prove that any polynomial symmetric in the variables $x_1, x_2, \ldots, x_n$ is a
polynomial in the elementary symmetric functions $E_t(x_1, x_2, \ldots, x_n)$,
$1 \le t \le n$.*

10 Express the symmetric function f(a, b, c, d) from Example 1.8.16 as a
polynomial in power sums.

11 Express $x^3 y + x y^3$ as a polynomial in

(a) $M_1(x, y)$ and $M_2(x, y)$. (b) $E_1(x, y)$ and $E_2(x, y)$.

12 Because equations like those in Exercises 9(b)-(d) are polynomial identities,
any numbers can be substituted for the variables $x_1, x_2, \ldots, x_n$.

(a) Use this idea to show that $1^2 + 2^2 + \cdots + n^2 = e(n, 1)^2 - 2e(n, 2)$.

(b) Use Fig. 1.9.2 and the result of part (a) to evaluate $1^2 + 2^2 + 3^2 + 4^2 + 5^2$.
(Confirm that your answer is consistent with Equation (1.35).)

(c) Find a formula for $1^3 + 2^3 + \cdots + n^3$ in terms of $e(n, t)$, $t \le n$. (Hint: Use
your solution to Exercise 9(c).)

(d) Use Fig. 1.9.2 and the result of part (c) to evaluate $1^3 + 2^3 + 3^3 + 4^3 + 5^3$.
(Confirm that your answer is consistent with Equation (1.36).)

*This is the so-called Fundamental Theorem of Symmetric Polynomials.

13 Let $E_t = E_t(a_1, a_2, \ldots, a_n)$, $0 \le t \le n$. Show that

(a) $(a_1 - 1)(a_2 - 1) \cdots (a_n - 1) = E_n - E_{n-1} + E_{n-2} - \cdots + (-1)^n E_0$.

(b) $(1 - a_1 x)(1 - a_2 x) \cdots (1 - a_n x) = E_0 - E_1 x + E_2 x^2 - \cdots + (-1)^n E_n x^n$.

14 If $n > t \ge 2$, prove that

\[ E_t(a_1, a_2, \ldots, a_n) = E_t(a_1, a_2, \ldots, a_{n-1}) + a_n E_{t-1}(a_1, a_2, \ldots, a_{n-1}). \]

(Hint: See the proof of Lemma 1.9.8.)

15 Give the inductive proof that

\[ \prod_{i=1}^{n} (x - a_i) = \sum_{t=0}^{n} (-1)^t E_t(a_1, a_2, \ldots, a_n)\, x^{n-t}. \]

16 If $f(x) = x^{(n+1)}$, show that $f'(0) = \pm n!$.

17 Show that

(a) $x^{(m+n)} = x^{(m)} (x - m)^{(n)}$.

(b) $(x + y)^{(n)} = \sum_{r=0}^{n} C(n, r)\, x^{(r)} y^{(n-r)}$.

18 Recall (Section 1.8, Exercise 15) that if $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_m]$ and
$\beta = [\beta_1, \beta_2, \ldots, \beta_k]$ are two partitions of n, then $\alpha$ majorizes $\beta$ if $m \le k$ and

\[ \sum_{i=1}^{r} \alpha_i \ge \sum_{i=1}^{r} \beta_i, \quad 1 \le r \le m. \]

(a) Show that majorization imposes a linear order on the $p_3(8) = 5$ partitions
of 8 having three parts.

(b) Among the many properties of elementary symmetric functions is Schur
concavity, meaning that $E_t(\alpha) \le E_t(\beta)$ whenever $\alpha$ majorizes $\beta$. Confirm
this property for $2 \le t \le 3$ using the three-part partitions of 8.

(c) If you were to compute $E_3(\alpha)$ for each four-part partition $\alpha$ of 24, which
partition would produce the maximum? (The minimum?)

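19 Let $H_t = H_t(x_1, x_2, \ldots, x_n)$ be the homogeneous symmetric function of
Section 1.8, Exercise 25. Then $H_t$ is Schur convex, meaning that
$H_t(\alpha) \ge H_t(\beta)$, whenever $\alpha$ majorizes $\beta$.

(a) Confirm this result for $H_2$ and the three-part partitions of 8.

(b) If you were to compute $H_4(\alpha)$ for each three-part partition $\alpha$ of 24, which
partition would produce the maximum? (The minimum?)

20 Show that the general formula for $E_t$ as a polynomial in the power sums $M_t$ is
$t!\,E_t = \det(L_t)$, where

\[
L_t = \begin{pmatrix}
M_1 & 1 & 0 & 0 & \cdots & 0 & 0 \\
M_2 & M_1 & 2 & 0 & \cdots & 0 & 0 \\
M_3 & M_2 & M_1 & 3 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
M_{t-1} & M_{t-2} & M_{t-3} & M_{t-4} & \cdots & M_1 & t - 1 \\
M_t & M_{t-1} & M_{t-2} & M_{t-3} & \cdots & M_2 & M_1
\end{pmatrix}.
\]

(Hint: Use Cramer's rule on the following matrix version of Newton's
identities:

\[
\begin{pmatrix}
1 & 0 & 0 & 0 & \cdots \\
M_1 & -2 & 0 & 0 & \cdots \\
M_2 & -M_1 & 3 & 0 & \cdots \\
M_3 & -M_2 & M_1 & -4 & \cdots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix}
\begin{pmatrix} E_1 \\ E_2 \\ E_3 \\ E_4 \\ \vdots \end{pmatrix}
=
\begin{pmatrix} M_1 \\ M_2 \\ M_3 \\ M_4 \\ \vdots \end{pmatrix}.)
\]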



21 Confirm that the result in Exercise 20, i.e., $t!\,E_t = \det(L_t)$, agrees with

(a) Equation (1.40) when t = 2.

(b) Equation (1.41) when t = 3.

(c) Equation (1.42) when t = 4.

22 Bertrand Russell* once wrote, "I used, when excited, to calm myself by
reciting the three factors of $a^3 + b^3 + c^3 - 3abc$."

(a) Express $a^3 + b^3 + c^3 - 3abc$ as a product of two nontrivial polynomials
that are symmetric in a, b, and c. (Hint: Example 1.9.15 and $M_1(a, b, c) =
E_1(a, b, c)$.)

(b) Show that $(a + b + c)(a + \theta b + \theta^2 c)(a + \theta^2 b + \theta c) = a^3 + b^3 + c^3 -
3abc$, where $\theta = \tfrac{1}{2}(-1 + i\sqrt{3})$ is a primitive cube root of unity.

(c) Show that if $a^3 + b^3 + c^3 - 3abc$ is a product of three polynomials, each
of which is symmetric in a, b, and c, then one (at least) of them is a
constant polynomial.

23 Prove that 

(a) e(n,2) = C(n+ 1,2). 

(b) $e(n, 3) = \tfrac{1}{48}(n - 2)(n - 1)\,n^2 (n + 1)^2$.



*In 1914, having completed Principia Mathematica with Alfred North Whitehead, Bertrand Russell
(1872-1970), Third Earl Russell, abandoned mathematics in favor of philosophy, social activism, and
writing. He was awarded the Nobel Prize for Literature in 1950.

24 Show that $\{x^{(n)} : 0 \le n \le m\} = \{1, x, x^{(2)}, x^{(3)}, \ldots, x^{(m)}\}$ is a basis for the
vector space of polynomials of degree at most m. (Hint: Show that any
polynomial $f(x) = b_m x^m + b_{m-1} x^{m-1} + \cdots + b_0$ of degree at most m can be
expressed (uniquely) as a linear combination of $1, x, x^{(2)}, x^{(3)}, \ldots, x^{(m)}$.)

25 Let A be a real, symmetric, n × n matrix with characteristic polynomial

\[ \det(xI_n - A) = x^n - c_1 x^{n-1} + c_2 x^{n-2} - \cdots + (-1)^n c_n. \]

Show that

(a) $c_1 = \sum_{i=1}^{n} a_{ii} = \operatorname{tr}(A)$, the trace of A.

(b) $c_2 = \tfrac{1}{2}\,[\operatorname{tr}(A)^2 - \operatorname{tr}(A^2)]$.

(c) $c_3 = \tfrac{1}{6}\,[\operatorname{tr}(A)^3 - 3\operatorname{tr}(A)\operatorname{tr}(A^2) + 2\operatorname{tr}(A^3)]$.

(d) $\operatorname{tr}(A^t) - c_1 \operatorname{tr}(A^{t-1}) + c_2 \operatorname{tr}(A^{t-2}) - \cdots + (-1)^t t c_t = 0$, $t \ge 1$.

26 Recall that $[k, 1^m]$ is shorthand for the partition of m + k consisting of a single
k followed by m 1's.

(a) Show that $M_s(x_1, x_2, \ldots, x_n) E_t(x_1, x_2, \ldots, x_n) = M_{[s+1, 1^{t-1}]}(x_1, x_2, \ldots, x_n) +
M_{[s, 1^t]}(x_1, x_2, \ldots, x_n)$, $s > 1$.

(b) Show that $M_1(x_1, x_2, \ldots, x_n) E_t(x_1, x_2, \ldots, x_n) = M_{[2, 1^{t-1}]}(x_1, x_2, \ldots, x_n) +
(t + 1) E_{t+1}(x_1, x_2, \ldots, x_n)$.

(c) Base a proof of Newton's identities on parts (a) and (b).



*1.10. COMBINATORIAL ALGORITHMS 

In a few generations you can breed a racehorse. The recipe for making a man like
Delacroix is less well known.

— Jean Renoir 

Algos is the Greek word for "pain"; algor is Latin for "to be cold"; and Al Gore is 
a former Vice President of the United States. Having no relation to any of these, 
algorithm derives from the ninth-century Arab mathematician Mohammed ben 
Musa al-Khowarizmi. Translated into Latin in the twelfth century, his book 
Algorithmi de numero Indorum consists of step-by-step procedures, or recipes, 
for solving arithmetic problems. 

As an illustration of the role of algorithms in mathematics, consider the following
example: one version of the well-ordering principle is that any nonempty set of
positive integers contains a least element. Given two positive integers a and b, well
ordering implies the existence of a least element d of the set

{sa + tb : s and t are integers and sa + tb > 0}.

This least element has a name; it is the greatest common divisor (GCD) of a and b.
Well ordering establishes the existence of d but furnishes little information about its
value. For that we must look elsewhere.

*Mohammed, son of Moses, of Khowarizm. Al-Khowarizmi also wrote Hisab al-jabr wa'l muqabalah,
from which the word algebra is derived. It was largely through the influence of his books that the Hindu-
Arabic numeration system reached medieval Europe.

Among the algorithms for computing GCDs is one attributed to Euclid, based on 
the fact that if r is the remainder when a is divided by b, then the GCD of a and b is 
equal to the GCD of b and r. A different algorithm is based on the unique prime 
factorizations of a and b. Either algorithm works just fine for small numbers, where 
the second approach may even have a conceptual advantage. For actual computa- 
tions with large numbers, however, the Euclidean algorithm is much easier and 
much much faster. 
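
To make the contrast concrete, here is a minimal Python sketch of the Euclidean algorithm next to a brute-force search based on the well-ordering description of d. Both function names are invented for this illustration, and the search window in the second function is an assumption: it relies on the fact that d can always be written as sa + tb with |s| and |t| no larger than max(a, b).

```python
def gcd_euclid(a, b):
    """Euclidean algorithm: replace (a, b) by (b, r), r the remainder of a divided by b."""
    while b:
        a, b = b, a % b
    return a


def gcd_least_combination(a, b):
    """Least positive value of s*a + t*b, with s and t searched over a finite window."""
    bound = max(a, b)
    window = range(-bound, bound + 1)
    return min(s * a + t * b for s in window for t in window if s * a + t * b > 0)


print(gcd_euclid(84, 36), gcd_least_combination(84, 36))
```

The search examines on the order of max(a, b) squared pairs (s, t), whereas the Euclidean loop finishes after a handful of remainders, which is the practical point of the paragraph above.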

Not until digital computers began to implement algorithms in calculations invol- 
ving astronomically large numbers did the mathematical community, as a whole, 
pay much attention to these kinds of computational considerations. Courses in 
the analysis of algorithms are relatively new to the undergraduate curriculum. 

This section is devoted to a naive introduction to a few of the ideas associated
with combinatorial algorithms. Let's begin with the multinomial coefficient

\[ \binom{n}{r_1, r_2, \ldots, r_k} = \frac{n!}{r_1!\, r_2! \cdots r_k!}, \]

where, e.g.,

\[ n! = 1 \times 2 \times \cdots \times n. \]

Observe that n! is not so much a number as an algorithm for computing a number.
To compute n!, multiply 1 by 2, multiply their product by 3, multiply that product
by 4, and so on, stopping only when the previous product has been multiplied
by n.

The following is a subalgorithm, or subroutine, to compute the factorial F of an
arbitrary integer X:

1. Input X.
2. F = 1 and I = 0.
3. I = I + 1.
4. F = F × I.
5. If I < X, then go to step 3.
6. Return F.

These lines should be interpreted as a step-by-step recipe that, absent directions
to the contrary (like "go to step 3"), is to be executed in numerical order. In step 6,
the value returned is F = X!.

This subroutine is written in the form of a primitive computer program. To a
hypothetical computer, symbols like X, F, and I are names for memory locations.
Step 1 should be interpreted as an instruction to wait for a number to be entered,
then to store the number in some ("random"*) memory location and, so as not to
forget the location, flag it with the symbol X. In step 2, the numbers 1 and 0 are
stored in memory locations labeled F and I, respectively. In step 3, the number
in memory location I is replaced with the next larger integer.† In step 4, the number
in memory location F is replaced with the product of the number found there and
the number currently residing in memory location I. If, in step 5, memory location I
contains X, operation moves on to step 6, where the subroutine terminates by
returning F = X!. Otherwise, the action loops back to step 3 for another iteration.

The loop in steps 3-5 can be expressed more compactly using the equivalent
"For . . . Next" construction found in steps 3-5 of the following:

1.10.1 (Factorial Subroutine) Algorithm

1. Input X.
2. F = 1.
3. For I = 1 to X.
4. F = F × I.
5. Next I.
6. Return F. □
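
For readers who would rather experiment on a real computer than a virtual one, both versions of the factorial subroutine translate almost line for line into Python. This is a sketch only; the function names are invented here, and the "go to" of the first version is imitated with a while loop.

```python
def factorial_goto(x):
    """The six-step subroutine, with the jump back to step 3 written as a loop."""
    f, i = 1, 0              # step 2
    while True:
        i = i + 1            # step 3
        f = f * i            # step 4
        if not i < x:        # step 5: fall through once I is no longer less than X
            return f         # step 6


def factorial_for(x):
    """Algorithm 1.10.1, using a For ... Next loop."""
    f = 1                        # step 2
    for i in range(1, x + 1):    # steps 3-5
        f = f * i
    return f


print(factorial_goto(5), factorial_for(5))
```

Both functions return the same value for every nonnegative integer input; the For ... Next form simply hides the bookkeeping of steps 3 and 5.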

The factorial subroutine affords the means to compute $n!$, $r_1!$, $r_2!$, and so on,
from which the multinomial coefficient $M = \binom{n}{r_1, r_2, \ldots, r_k}$ can be obtained, either as
the quotient of $n!$ and the product of the factorials of the r's or, upon dividing $n!$ by
$r_1!$, dividing the quotient by $r_2!$, dividing that quotient by $r_3!$, and so on. While these
two approaches may be arithmetically equivalent, they represent different algorithms.

1.10.2 (Multinomial Coefficient) Algorithm

1. Input n, k, r_1, r_2, ..., r_k.
2. X = n.
3. Call Algorithm 1.10.1.
4. M = F.
5. For j = 1 to k.
6. X = r_j.
7. Call Algorithm 1.10.1.
8. M = M/F.
9. Next j.
10. Return M. □

*Hence the name random-access memory, or RAM.

†Notations such as "I ← I + 1" or "I := I + 1" are sometimes used in place of "I = I + 1".



Having let X = n in step 2, the factorial subroutine is called upon in step 3 to
return F = n!. Thus, in step 4, the number entered into memory location M is n!. On
the first trip through the loop in steps 5-9, j = 1 and X = $r_1$. When the factorial
subroutine is called in step 7, the number it returns is F = $r_1!$ so, in step 8, the number
in memory location M is replaced by $n!/r_1!$. Assuming j < k in step 9, action is
directed back to step 5, and the value of j is increased by 1. The second time step 8
is encountered, the number currently being stored in memory location M, namely,
$n!/r_1!$, is replaced with $(n!/r_1!)/r_2! = n!/(r_1!\,r_2!)$. And so on. Finally, the kth and
last time step 8 is encountered, the number in memory location M is replaced with
$\binom{n}{r_1, r_2, \ldots, r_k}$.

It might be valuable to pause here and give this algorithm a try, either by writing
a computer program to implement it or by following the steps of Algorithm 1.10.2
yourself as if you were a (virtual) computer. Test some small problem, the answer
to which you already know, e.g., $\binom{11}{4, 4, 2, 1} = 34{,}650$ from the original MISSISSIPPI
problem. After convincing yourself that the algorithm works properly, try it on
C(100, 2).
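
One possible rendering of Algorithm 1.10.2 in Python is sketched below. The function names are invented for this illustration, and one caveat applies: Python integers have unlimited precision, so this version does not fail on C(100, 2); it merely carries the full value of 100! along as an intermediate result. Integer division is used in step 8 because every quotient that arises there is a whole number.

```python
def factorial(x):
    """Algorithm 1.10.1."""
    f = 1
    for i in range(1, x + 1):
        f *= i
    return f


def multinomial_by_factorials(n, rs):
    """Algorithm 1.10.2: start from n! and divide by each r_j! in turn."""
    m = factorial(n)             # steps 2-4
    for r in rs:                 # steps 5-9
        m //= factorial(r)       # step 8 (the division is exact)
    return m


print(multinomial_by_factorials(11, [4, 4, 2, 1]))   # the MISSISSIPPI problem
print(multinomial_by_factorials(100, [2, 98]))       # C(100, 2)
print(len(str(factorial(100))))                      # number of digits in 100!
```

The last line reports how many digits the intermediate value 100! has, which gives a sense of why a machine with fixed-size number storage might balk.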

Whether your computer is virtual or real, using Algorithm 1.10.2 to compute
C(100, 2) may cause it to choke. If this happens, the problem most likely involves
the magnitude of 100!. The size of this number can be estimated by means of an
approximation known as Stirling's formula*:

\[ n! \approx \sqrt{2\pi n}\,\left(\frac{n}{e}\right)^{n}. \tag{1.44} \]

Using common logarithms, $100/e \approx 36.8 \approx 10^{1.57}$, so $(100/e)^{100} \approx 10^{157}$. Since
$\sqrt{2\pi} \times 10 \approx 25$, Equation (1.44) yields $100! \approx 2.5 \times 10^{158}$. (Current estimates
put the age of the universe at something less than $5 \times 10^{26}$ nanoseconds.)
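
Stirling's formula is easy to compare with the exact factorial. The sketch below is an illustration; the function name stirling is invented here. Note that the hand estimate above rounds the exponent 1.5657... up to 1.57 before raising to the 100th power, so it overshoots somewhat: evaluated without that rounding, the formula gives roughly $9.3 \times 10^{157}$, within a tenth of one percent of the true value of 100!.

```python
import math


def stirling(n):
    """Right-hand side of Equation (1.44)."""
    return math.sqrt(2 * math.pi * n) * (n / math.e) ** n


exact = math.factorial(100)
print(f"Stirling: {stirling(100):.4e}")
print(f"Exact:    {exact:.4e}")
print(f"Ratio:    {exact / stirling(100):.6f}")
```

The ratio of the exact value to the approximation is slightly greater than 1 and approaches 1 as n grows.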

Without a calculator or computer, one would not be likely even to consider
evaluating C(100, 2) by first computing 100!, because something along the following
lines is so much easier:

\begin{align*}
C(100, 2) &= \frac{100 \times 99 \times 98!}{98! \times 1 \times 2} \\
&= 99 \times 50 \\
&= (100 - 1) \times 50 \\
&= 4950.
\end{align*}

*Stirling's formula should not be confused with Stirling's identity, soon to be encountered in Chapter 2.

The key to converting this easier approach into an algorithm is best illustrated with
a slightly less trivial example, e.g., (see Theorem 1.5.1)

\[ \binom{n}{r, s, t} = \frac{P(n, r)}{r!} \times \frac{P(n - r, s)}{s!} \times \frac{P(n - r - s, t)}{t!}. \tag{1.45} \]

Viewing $P(n, r)/r!$ as

\[ \frac{n \times (n - 1) \times \cdots \times (n - r + 1)}{1 \times 2 \times \cdots \times r} = \frac{n}{1} \times \frac{n - 1}{2} \times \cdots \times \frac{n - r + 1}{r}, \]

$P(n - r, s)/s!$ as

\[ \frac{n - r}{1} \times \frac{n - r - 1}{2} \times \cdots \times \frac{n - r - s + 1}{s}, \]

and so on, suggests another subroutine:

1. M = 1.
2. For J = 1 to r.
3. M = M × N/J.
4. N = N - 1.
5. Next J.

Setting N = n and r = $r_I$ and nesting this subroutine inside a "For I = 1 to k" loop
yields another algorithm.

Can we do better? Almost surely. Because n = r + s + t, the last factor in Equation
(1.45) is $P(t, t)/t! = t!/t! = 1$. Evidently, "For I = 1 to k - 1" suffices in the
"outside loop". On the other hand, since $\binom{n}{r_1, r_2, \ldots, r_k} = \binom{n}{r_2, \ldots, r_k, r_1}$, the outside loop
could just as well be "For I = 2 to k".



1.10.3 (Improved Multinomial Coefficient) Algorithm

1. Input n, k, r_1, r_2, ..., r_k.
2. M = 1 and N = n.
3. For I = 2 to k.
4. For J = 1 to r_I.
5. M = M × N/J.
6. N = N - 1.
7. Next J.
8. Next I.
9. Return M. □
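
A possible Python rendering of Algorithm 1.10.3 follows. It is a sketch under stated assumptions: the function name is invented, and step 5 is carried out as "multiply first, then divide" in integer arithmetic. That is legitimate because after J passes through the inner loop the running value equals the value at the start of the loop times the binomial coefficient C(N, J), with N taken as it stood at the start of the loop, so each division comes out even.

```python
def multinomial_improved(n, rs):
    """Algorithm 1.10.3; rs is the list [r_1, r_2, ..., r_k] with sum n."""
    m, big_n = 1, n                   # step 2
    for r in rs[1:]:                  # step 3: For I = 2 to k (r_1 is skipped)
        for j in range(1, r + 1):     # step 4: For J = 1 to r_I
            m = m * big_n // j        # step 5, as an exact integer operation
            big_n -= 1                # step 6
    return m


print(multinomial_improved(11, [4, 4, 2, 1]))
print(multinomial_improved(100, [98, 2]))
```

Listing the largest r first, as in the second call, keeps the intermediate values small: for C(100, 2) the running value never exceeds 4950.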
LUCK LUKC LCUK LCKU LKUC LKCU
ULCK ULKC UCLK UCKL UKLC UKCL
CLUK CLKU CULK CUKL CKLU CKUL
KLUC KLCU KULC KUCL KCLU KCUL

Figure 1.10.1. Rearrangements of LUCK.



It is clear from our experience so far that different algorithms can achieve the
same outcome, some better than others! Algorithm 1.10.3 is superior to Algorithm
1.10.2 because it is more widely applicable. (Check to see that calculating 
C(100, 2) is no trouble for Algorithm 1.10.3.) In general, however, it is not always 
clear which of two (or more) algorithms is best. It may not even be clear how to 
interpret "best"! 

This book began with a discussion of the four-letter words that can be produced 
by rearranging the letters in LUCK. An initial (brute-force) approach resulted in a 
systematic list, reproduced in Fig. 1.10.1 for easy reference. In subsequent discus- 
sions, it was often useful to imagine constructing a list, with the implied under- 
standing that list making is mildly distasteful. And, so it is, as long as the only 
reason to make a list is to count the words on it! Such peremptory judgments do 
not apply when the list serves other purposes. There are, in fact, many good reasons 
to make a list. 

Suppose one had a reason for wanting a list of the 4! = 24 rearrangements of 
LUCK, e.g., to use in constructing a master list of encryption keys upon which 
to base monthly corporate passwords for the next two years. In order to be most 
useful, such a list should be organized so that specific words are easy to locate. Fig- 
ure 1.10.1 gives one possibility, based on the order in which the letters appear in 
LUCK. A more common approach is based on the order in which letters appear in 
the alphabet. 

1.10.4 Definition. Let $X = x_1 x_2 \ldots x_p$ and $Y = y_1 y_2 \ldots y_q$ be words containing p
and q letters, respectively. Then X comes before Y, in dictionary order,* if $x_1$ comes
before $y_1$ in alphabetical order; or if there is a positive integer $r \le p$ such that
$x_i = y_i$, $1 \le i < r$, and $x_r$ precedes $y_r$ in alphabetical order; or if p < q and
$x_i = y_i$, $1 \le i \le p$.

A list of words in dictionary order is often called an alphabetized list, and dic- 
tionary order is sometimes referred to as "alphabetical order." Whatever such lists 
are called, algorithms to generate them are surprisingly difficult to design. Our 
approach takes advantage of the numerical order that is already hard-wired into 
computers. 



*Dictionary order is also known as lexicographic order, lexicon being another word for "dictionary".

1.10.5 Example. Consider "words" assembled from the alphabet {0, 1, 2, ..., 9}.
Suppose alphabetical order for these ten "letters" is interpreted as numerical order.
Would it surprise you to learn that, in this context, dictionary order does not
coincide with the usual extension of numerical order? While 9 comes before 10 in
numerical order, 9 comes after 10 in dictionary order! (Confirm that, upon
restriction to number/words of the same length, the two orderings do coincide.) □
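
Definition 1.10.4 translates directly into a comparison routine. The Python sketch below is an illustration; the function name comes_before is invented, and it assumes that Python's built-in ordering of single characters agrees with alphabetical order, which holds for the digits 0-9 and for capital letters A-Z.

```python
def comes_before(x, y):
    """True if word x precedes word y in dictionary order (Definition 1.10.4)."""
    for a, b in zip(x, y):
        if a != b:
            return a < b          # the first position where the words differ decides
    return len(x) < len(y)        # otherwise the shorter word, if any, comes first


print(comes_before("10", "9"))    # dictionary order on the "words" 10 and 9
print(10 < 9)                     # numerical order on the numbers 10 and 9
print(comes_before("LUCK", "LUKC"))
```

The first two lines of output disagree, which is exactly the phenomenon described in Example 1.10.5.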

1.10.6 Example. In the spirit of Example 1.10.5, consider the 4! = 24 four- 
letter words that can be assembled by rearranging the letters/digits in 3142. Among 
the challenges that stand between us and an algorithm to generate and list these 
words in dictionary order is familiarity! We do chores like this all the time without 
thinking about how we do them. 

Let's start at the beginning, focusing on process: Since 1 comes first in alphabe- 
tical order, any word that begins with 1 will precede, in dictionary order, all words 
that begin with something else. Similarly, among the words whose first letter is 1, 
any whose second letter is 2 will precede all those whose second letter is not. Con- 
tinuing in this way, it is easy to see that the list must begin with 1234, the unique 
rearrangement of 3142 in which the letters occur in increasing alphabetical order. 
Reversing the argument shows that the last word on the list is 4321, the unique word 
in which the letters decrease, in alphabetical order (when read from left to right). 

Because only two rearrangements of 3142 have initial fragment 12, the word fol- 
lowing 1234 on the list can only be 1243. Indeed, any two words with the same 
initial fragment have tailing fragments consisting of the same (complementary) let- 
ters. Moreover, all words with the same initial fragment must appear consecutively 
on the list, starting with the word in which the tailing letters are arranged in increas- 
ing order and ending with the word in which the tailing letters are in decreasing 
order. 

After 1243 come the words with initial fragment 13. In the first of these, the tail 
is 24, and in the second it is 42. The observation that 42 is the reverse of 24 suggests 
a two-step procedure for finding the next word after 1342 on the list. 

In the first step, 1342 is transformed into the intermediate word 1432 by switch- 
ing the positions of 3 and 4. Observe that, while the switch changes the tail from 42 
to 32, the new tail is (still) in decreasing order. In the second step, this intermediate 
word is transformed from last to first among the words with initial fragment 14 by 
reversing its tail. The result, 1423, is the next rearrangement of 3142 after 1342. 

What comes after 1423? Well, 1432, of course! But, how does 1432 emerge from 
the two-step process outlined in the previous paragraph? Because 1423 is the only 
word on the list that begins with 142, it is the last word on the list with initial frag- 
ment 142. (This time, the tail is 3.) Switching 2 and 3 results in the intermediate 
word 1432 (whose tail is 2). Because a tail of length one reverses to itself, the 
output of the two-step process is 1432. 

What comes after 1432? Because 432 is in decreasing order, 1432 is the last
word on the list with initial fragment 1. Switching 1 with 2 produces the intermediate
word 2431. Reversing the tail, 431, yields the next word on the list, namely,
2134.

Imagine yourself somewhere in the middle of the list, having just written the
word $d_1 d_2 d_3 d_4$. Using the two-step process to find the next word depends on being
able to recognize the letter to be switched. The key to doing that is the tail. Assuming
$d_1 d_2 d_3 d_4 \ne 4321$, the only way it can be the last word on the list with initial
fragment $d_1 \ldots d_j$ is if letters $d_{j+1}, \ldots, d_4$ are in decreasing order. For $d_j$ to be the
letter that gets switched, there must be some letter in the tail with which to switch it,
i.e., some $d_k \in \{d_{j+1}, \ldots, d_4\}$ that comes after $d_j$ in alphabetical (numerical) order.
If $d_j, d_{j+1}, \ldots, d_4$ were in decreasing order, there could be no such $d_k$.

In the two-step process, the tail is the longest fragment (starting from the right-hand
end of $d_1 d_2 d_3 d_4$) whose letters decrease (when read from left to right). Put
another way, the letter to be switched is $d_j$, where j is the largest value of i such
that $d_i < d_{i+1}$. Once j has been identified, step 1 is accomplished by switching $d_j$
with $d_k$, where $d_k$ is the smallest letter in the tail that is larger than $d_j$, i.e.,

\[ d_k = \min\{d_i : i > j \text{ and } d_i > d_j\}. \tag{1.46} \]

(Because $d_{j+1} > d_j$ and because $d_{j+1}$ belongs to the tail, $d_k$ always exists.)

When $d_j$ and $d_k$ are switched, a new tail is produced in which $d_k$ (from the old
tail) has been replaced by $d_j$. Because of the way j and $d_k$ have been chosen, the
letters in the new tail are (still) decreasing. Reversing the new tail in step 2 is
equivalent to rearranging its letters into increasing order. □

The discussion in Example 1.10.6 leads to an algorithm for listing, in dictionary 
order, all rearrangements of 3142. 

1.10.7 Algorithm

1. Set d_i = i, 1 ≤ i ≤ 4.
2. Write d_1 d_2 d_3 d_4.
3. If d_i > d_{i+1}, 1 ≤ i ≤ 3, then stop.
4. Let j be the largest i such that d_i < d_{i+1}.
5. Let k be chosen to satisfy Equation (1.46).
6. Switch d_j and d_k.*
7. Reverse d_{j+1}, ..., d_4.
8. Go to step 2. □
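
One way to carry out Algorithm 1.10.7 on a real computer is sketched below in Python. The function name is invented for this illustration, the word length n is a parameter added here (the text fixes n = 4), and indices run from 0 rather than 1, as is customary in Python.

```python
def rearrangements_of_digits(n=4):
    """List the rearrangements of 12...n in dictionary order (Algorithm 1.10.7)."""
    d = list(range(1, n + 1))                                  # step 1
    words = []
    while True:
        words.append("".join(str(x) for x in d))               # step 2
        ascents = [i for i in range(n - 1) if d[i] < d[i + 1]]
        if not ascents:                                        # step 3: d is decreasing
            return words
        j = ascents[-1]                                        # step 4
        # Step 5: position of the smallest tail letter larger than d[j], as in (1.46).
        k = min((i for i in range(j + 1, n) if d[i] > d[j]), key=lambda i: d[i])
        d[j], d[k] = d[k], d[j]                                # step 6
        d[j + 1:] = reversed(d[j + 1:])                        # step 7


words = rearrangements_of_digits(4)
print(len(words))
print(" ".join(words[:7]))
```

The first seven words printed should retrace the discussion in Example 1.10.6, from 1234 through 2134.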

It would not be a bad idea to pause and implement Algorithm 1.10.7 on a 
computer (real or virtual) and check to see that the output is something closely 
resembling Fig. 1.10.2. 

*So that the new $d_j$ is the old $d_k$, and vice versa.

1234 1243 1324 1342 1423 1432
2134 2143 2314 2341 2413 2431
3124 3142 3214 3241 3412 3421
4123 4132 4213 4231 4312 4321

Figure 1.10.2. The 24 rearrangements of 1234.

What about the master list of encryption keys upon which to base monthly
corporate passwords for the next two years? An algorithm to generate a list, in
dictionary order, of all 24 rearrangements of LUCK, is only a step or two from
Algorithm 1.10.7. The missing steps involve explaining to a computer that C, K,
L, U is an alphabetical listing of the letters in LUCK. This is most easily accomplished
using "string variables".

Like a word, a text string is a sequence (ordered concatenation) of symbols. Like 
numbers, strings of text can be stored in memory locations and labeled with sym- 
bols. But, it is often necessary to choose labels that distinguish string memory loca- 
tions from those used to store numbers. We will use a dollar sign to indicate a string 
variable. The notation A$(4) = "FOOD", e.g., indicates that the string FOOD 
should be stored in the fourth cell of an array of string variable memory locations 
labeled A$. 

1.10.8 Example. To convert Algorithm 1.10.7 to an algorithm for generating, in
dictionary order, the rearrangements of LUCK, add step

0. L$(1) = "C", L$(2) = "K", L$(3) = "L", L$(4) = "U"

and modify step 2 so that it reads

2. Write L$(d_1) L$(d_2) L$(d_3) L$(d_4). □

Why not pause, modify Algorithm 1.10.7 now, and confirm that its output resembles
Fig. 1.10.3? (Compare with Fig. 1.10.1.)

1.10.9 Example. The conversion of Algorithm 1.10.7 in Example 1.10.8 was 
relatively easy because the letters L, U, C, and K are all different. How much harder 
would it be to design an algorithm to generate, in dictionary order, all 4!/2 = 12 
four-letter rearrangements of LOOK? 



CKLU CKUL CLKU CLUK CUKL CULK
KCLU KCUL KLCU KLUC KUCL KULC
LCKU LCUK LKCU LKUC LUCK LUKC
UCKL UCLK UKCL UKLC ULCK ULKC

Figure 1.10.3. Rearrangements of LUCK in dictionary order.



*As the name digital computer suggests, these machines were conceived and designed to crunch numbers. 
Numerical order is programmed into their genes, so to speak. Tasks related to word processing, on the 
other hand, have to be "learned", or "memorized" (which is why word processing software takes up so 
much space on a hard drive). 
Let's begin with an algorithm to produce, in dictionary order, all twelve rearrangements
of 1233. This is surprisingly easy! It can be done by replacing step 1 in
Algorithm 1.10.7 with

1. Set d_1 = 1, d_2 = 2, d_3 = 3, and d_4 = 3

and replacing "<" in step 4 with "≤".

To generate an ordered list of the rearrangements of LOOK, it suffices to modify 
this modified algorithm in the same way that Algorithm 1.10.7 was modified to 
obtain Example 1.10.8, namely, by adding step 

0. L$(1) = "K", L$(2) = "L", L$(3) = "O"

and changing step 2 to

2. Write L$(d_1)L$(d_2)L$(d_3)L$(d_4).

At this point, how hard can it be to write an algorithm for listing, in dictionary
order, all 11-letter words that can be produced by rearranging the letters in
MISSISSIPPI? □
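
For readers who would like to check their own algorithms against a machine-generated list, here is a minimal Python sketch. It is not a transcription of Algorithm 1.10.7; it assumes the standard "next arrangement" step (find the rightmost place where the word can be increased, swap, then reverse the tail), and the function name `rearrangements` is hypothetical. Because the comparisons allow ties, repeated letters as in LOOK are handled without producing duplicates.

```python
def rearrangements(word):
    """Yield the distinct rearrangements of `word` in dictionary order."""
    d = sorted(word)                 # the first arrangement in dictionary order
    n = len(d)
    while True:
        yield "".join(d)
        # Find the rightmost position i with d[i] < d[i + 1].
        i = n - 2
        while i >= 0 and d[i] >= d[i + 1]:
            i -= 1
        if i < 0:                    # d is non-increasing: the list is complete
            return
        # Find the rightmost entry that is larger than d[i] and swap.
        j = n - 1
        while d[j] <= d[i]:
            j -= 1
        d[i], d[j] = d[j], d[i]
        # Put the tail back into increasing order.
        d[i + 1:] = reversed(d[i + 1:])


if __name__ == "__main__":
    print(list(rearrangements("LOOK")))
    print(sum(1 for _ in rearrangements("MISSISSIPPI")))
```

Running the sketch on LUCK should reproduce Fig. 1.10.3, and the count printed for MISSISSIPPI can be compared with the multinomial coefficient 11!/(4!4!2!).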

It is one thing to generate and list, in dictionary order, all possible rearrange- 
ments of the letters in some arbitrary word. It is something else to rearrange 
some arbitrary list of words into dictionary order. The latter is a so-called sorting 
problem. The comparison of various sorting algorithms affords a natural introduc- 
tion to some applications of combinatorics in the analysis of algorithms. Those 
interested in pursuing such a discussion are referred to Appendix A2. 

1.10.10 Example. A systematic listing of the seven partitions of 5 might be 
expected to look like this: 

[5], [4, 1], [3,2], [3, 1,1], [2,2, 1], [2, 1, 1, 1], [1, 1, 1, 1, 1]. 

In reverse dictionary order, $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_\ell] \vdash n$ comes before
$\beta = [\beta_1, \beta_2, \ldots, \beta_k] \vdash n$ if (and only if) $\alpha_1 > \beta_1$ or there is an integer
$t < \ell$ such that $\alpha_i = \beta_i$, $1 \le i \le t$, and $\alpha_{t+1} > \beta_{t+1}$. Let's see if we can devise
an algorithm to generate and list, in reverse dictionary order, all $p(n)$ partitions of $n$.

Because the list begins with $[n]$, all that's required is a step-by-step procedure
to find the next partition, in reverse dictionary order, after a fixed but
arbitrary $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_\ell] \ne [1^n]$ (the last partition on the list). There are two
cases.

Case 1: If $\alpha_\ell = 1$, then $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_k, 1, \ldots, 1]$, where 1 occurs with
multiplicity $m$, $\alpha_k > 1$, and $\ell = k + m$. If $\mu$ is the next partition after $\alpha$, then $\mu$ is the
first partition, in reverse dictionary order, that satisfies the conditions $\mu_i = \alpha_i$,
$1 \le i < k$, and $\mu_k = \alpha_k - 1$. To find $\mu$, let $S = \alpha_k + m$, the sum of the parts of $\alpha$ coming
after $\alpha_{k-1}$. If $q$ is the quotient and $r$ the remainder, when $S$ is divided by
$d = \alpha_k - 1$, then

$\mu = [\alpha_1, \alpha_2, \ldots, \alpha_{k-1}, \alpha_k - 1, \ldots, \alpha_k - 1, r]$,

where $\alpha_k - 1$ occurs with multiplicity $q$ and it is understood that $r$ does not appear if
it is zero.

Case 2: If $\alpha_\ell > 1$, the next partition after $\alpha$ is

$\mu = [\alpha_1, \alpha_2, \ldots, \alpha_{\ell-1}, \alpha_\ell - 1, 1]$. □

Let's design an algorithm to implement the ideas of Example 1.10.10.
Suppose

$\alpha = [n^{m(n)}, \ldots, 2^{m(2)}, 1^{m(1)}]$,

where $i^{m(i)}$ is understood not to appear when $m(i) = 0$. If $m(1) = n$, then $\alpha = [1^n]$
and the list is complete. Otherwise, let $j$ be the smallest integer larger than 1 such
that $m(j) > 0$. The steps used in Example 1.10.10 to produce $\mu$, the next partition
after $\alpha$, are these. Replace $m(j)$ with $m(j) - 1$. In case 1 (the case in which
$m(1) > 0$), let $q$ and $r$ be the quotient and remainder when $S = j + m(1)$ is divided
by $d = j - 1$. Set $m(1) = 0$; then set $m(j - 1) = q$ and, if $r > 0$, set $m(r) = 1$.
In case 2, if $j = 2$, set $m(1) = 2$; otherwise, set $m(j - 1) = 1$ and $m(1) = 1$. A
formal algorithm might look like this:

1.10.11 (Partition Generating) Algorithm

1. Input n.
2. Set m(i) = 0, 1 ≤ i < n, and m(n) = 1.
3. Write [n^m(n), ..., 2^m(2), 1^m(1)].
4. If m(1) = n, then stop.
5. S = m(1).
6. m(1) = 0.
7. j = 1.
8. j = j + 1.
9. If m(j) = 0, then go to step 8.
10. D = j − 1.
11. m(j) = m(j) − 1.
12. If S = 0, then go to step 19.
13. S = S + j.
14. Q = ⌊S/D⌋.
15. R = S − D × Q.
16. m(D) = Q.
17. If R > 0, then m(R) = 1.
18. Go to step 3.
19. If j = 2, then go to step 23.
20. m(D) = 1.
21. m(1) = 1.
22. Go to step 3.
23. m(1) = 2.
24. Go to step 3. □

Note that case 1 is addressed in steps 13-18 of Algorithm 1.10.11, while case 2 
is handled in steps 19-24. 

Having endured the development of Algorithm 1.10.11, why not convert it to a 
computer program and have the satisfaction of seeing the partitions of n appear on a 
computer screen? 
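
One possible conversion is sketched below in Python, under the assumption that n ≥ 1. The function name `partitions` and the list `m` (indexed so that `m[i]` plays the role of m(i)) are choices made here for illustration; the "go to" steps of Algorithm 1.10.11 have been replaced by a loop and an if/elif/else, but the arithmetic follows steps 5 through 24.

```python
def partitions(n):
    """Yield the partitions of n (n >= 1) in reverse dictionary order."""
    m = [0] * (n + 1)          # m[i] = multiplicity of the part i; m[0] is unused
    m[n] = 1                   # step 2: start with the partition [n]
    while True:
        # Step 3: write the current partition, largest parts first.
        yield [i for i in range(n, 0, -1) for _ in range(m[i])]
        if m[1] == n:          # step 4: [1^n] is the last partition
            return
        s = m[1]               # steps 5-6
        m[1] = 0
        j = 2                  # steps 7-9: smallest part larger than 1
        while m[j] == 0:
            j += 1
        d = j - 1              # steps 10-11
        m[j] -= 1
        if s > 0:              # case 1 (steps 13-17)
            s += j
            q, r = divmod(s, d)
            m[d] = q
            if r > 0:
                m[r] = 1
        elif j == 2:           # case 2 with j = 2 (step 23)
            m[1] = 2
        else:                  # case 2 with j > 2 (steps 20-21)
            m[d] = 1
            m[1] = 1


if __name__ == "__main__":
    for p in partitions(5):
        print(p)
```

For n = 5 the printed list can be compared, line by line, with the seven partitions displayed in Example 1.10.10.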



1.10. EXERCISES 

1 Write an algorithm to list the integers 1-100 in numerical order. 

2 Write an algorithm to input two numbers and output 

(a) their product. 

(b) their sum. 

(c) their difference. 

3 Assuming that $r_1, r_2, \ldots, r_k$ vary in size, which of them should be chosen to
play the role of $r_1$ in Algorithm 1.10.3?

4 Without actually running any programs, describe the output that would be 
produced if step 0 in Example 1.10.8 were replaced with 

0. L$(1) = "K", L$(2) = "L", L$(3) = "O", L$(4) = "O".

5 Write an algorithm to generate and list, in dictionary order, 

(a) all 5! = 120 rearrangements of LUCKY. 

(b) all 4!/2 = 12 rearrangements of COOL. 

6 Write an algorithm to compute and output the first ten rows (as n goes from 0 
to 9) of Pascal's triangle. Base your algorithm on 

(a) the algebraic formula $C(n, r) = n!/[r!(n - r)!]$.

(b) Pascal's relation. 

7 Write an algorithm to generate and list, in dictionary order, all rearrangements 
of 

(a) BANANA. (b) MISSISSIPPI. (c) MATHEMATICS. 
8 Write an algorithm to generate and output the first ten rows of the partition
triangle (i.e., the array whose $(n, m)$-entry is $p_m(n)$, the number of m-part
partitions of n).

9 Write an algorithm to input n and output p(n), the number of partitions of n. 
Base your algorithm on 

(a) your solution to Exercise 8. 

(b) Algorithm 1.10.11. 

10 Write an algorithm to input $a_0$-$a_4$ and $b_0$-$b_3$ and to output the coefficient of $x^k$,
$7 \ge k \ge 0$, in the product

$(a_0 x^4 + a_1 x^3 + \cdots + a_4)(b_0 x^3 + b_1 x^2 + \cdots + b_3)$.

11 Write an algorithm to input $x_1$-$x_6$ and to output

(a) the third elementary symmetric function, $E_3(x_1, x_2, \ldots, x_6)$.

(b) all C(6, 3) three-element subsets of {1, 2, 3, 4, 5, 6}.

(c) all C(6, 3) three-element subsets of $\{x_1, x_2, \ldots, x_6\}$.

12 Write an algorithm to input $x_1$-$x_6$ and to output

(a) $E_2(x_1, x_2, \ldots, x_6)$.

(b) all C(6, 2) two-element subsets of $\{x_1, x_2, \ldots, x_6\}$.

(c) the complements of the subsets in part (b).

(d) $E_4(x_1, x_2, \ldots, x_6)$.

13 Write an algorithm to input six positive numbers $x_1$-$x_6$ and to output

$E_5(x_1, x_2, \ldots, x_6)$.

14 Write an algorithm to input the parts of a partition and output the parts of its 
conjugate. 

15 Assuming 0 comes before 1 in alphabetical order, write an algorithm to 
generate and output, in dictionary order, 

(a) all binary words of length 4 (i.e., all four-letter words that can be 
assembled using the alphabet {0, 1}). 

(b) all binary words of length 8 and weight 4, where the weight of a binary 
word is the number of l's among its bits. 

16 Write an algorithm to input n and output, in dictionary order, all binary words
of length n. (Hint: Exercise 15(a).)

17 The problem in Exercise 16 is to generate and list binary words in dictionary 
order. Here, the problem is to generate and list binary words in a different 
order, one in which adjacent words differ in a single bit.* Because the kth word
differs from its immediate predecessor in a single bit, to solve this problem it
suffices to identify that bit. Here is a procedure for doing that: Every bit of the
first word is zero. For $1 < k \le 2^n$, the kth word is obtained from its
predecessor by changing the dth bit, where d − 1 is the highest power of 2
that exactly divides k − 1.

(a) List the 16 binary words of length 4 in the order prescribed by this 
procedure. (Hint: As you go along, check to be sure that each newly listed 
word is different from all of its predecessors, and that it differs from its 
immediate predecessor in a single bit.) 

(b) Show that word k differs from word $2^n - k + 1$ in a single bit, $1 \le k \le 2^n$.

(c) Show that the procedure described in this exercise generates $2^n$ different
binary words of length n.

(d) Write an algorithm to implement the procedure described in the 
introduction to this exercise. 

(e) Write an algorithm to list the $2^n$ subsets of {1, 2, ..., n} in such a way that
any two adjacent subsets on the list differ by just one element.

18 Assuming the keyword RND returns a pseudorandom† number from the
interval (0, 1), the following subroutine will generate 1000 pseudorandom
integers from the interval [0, 9]:

1. For I = 1 to 1000.

2. R(I) = ⌊10 × RND⌋.

3. Next I.

To the extent that RND simulates a true random-number generator, each 
integer in [0, 9] ought to occur with equal likelihood. Each time the subroutine 
is implemented, one would expect the number 9, e.g., to occur about 100 
times. 

(a) Write a computer program based on (an appropriate modification of) the 
subroutine to generate and output 50 pseudorandom integers between 0 
and 9 (inclusive). 

(b) Run your program from part (a) ten times (using ten different randomizing 
"seeds") and record the number of 9's that are produced in each run. 

(c) Modify your program from part (a) to generate and print out 500 
pseudorandom integers between 0 and 9 (inclusive) and, at the end, to 
output the number of 9's that were printed. 



*A list in which each entry differs as little as possible from its predecessor is commonly called a "Gray
code". Because such lists have nothing to do with binary codes, "Gray list" might be a better name for
them.

†An algorithm to generate random numbers is something of an oxymoron. Truly random numbers are
surprisingly difficult to obtain.
19 Assuming keyword RND returns a pseudorandom number, here is an algorithm
to simulate the flipping of a single fair coin:

1. X = RND.

2. If X < 1/2, then write "H".

3. If X ≥ 1/2, then write "T".

(a) Write an algorithm to output 100 simulated flips of a fair coin. 

(b) If you were to run a computer program that implements your algorithm 
from part (a), how many H's would you expect to see? 

(c) Write a computer program to implement your algorithm from part (a), run 
it ten times (with ten different randomizing "seeds"), and record the total 
number of H's produced on each run. 

(d) Write an algorithm to output 100 simulated flips of a fair coin and, at the 
end, output the total numbers of heads and tails. 

(e) Write an algorithm to output 100 simulated flips of a fair coin and, at the 
end, output the (empirical) probability of heads. 

20 If a fair coin is flipped 100 times, it would not be unusual to see a string of four 
or five heads in a row. 

(a) Run your program from Exercise 19(c) ten times (using ten different
randomizing "seeds") and record the longest string of consecutive H's
and the longest string of consecutive T's for each run.

(b) Modify your algorithm/program from Exercise 19(a)/(c) so that it outputs 
the length of a longest string of consecutive H's and of a longest string of 
consecutive T's. 

21 Suppose 12 fair coins are tossed into the air at once. 

(a) Compute the probability of six heads and six tails. 

(b) Write an algorithm to simulate 100 trials of tossing a dozen coins and to 
output the empirical probability that half the coins come up heads and half 
tails. (See the discussion of the keyword RND in the introduction to 
Exercise 18.) 

22 Write an algorithm to simulate 100 flips of a biased coin, one in which heads 
occurs a third of the time. (Hint: See the introduction to Exercise 19.) 

23 Write an algorithm to simulate 100 rolls of a fair die. (See the introduction to 
Exercise 18 for an explanation of the keyword RND.) 

24 Assuming keyword RND returns a pseudorandom number, write an algorithm 
to simulate 1200 trials of rolling two (fair) dice 

(a) and output the results. 

(b) and output the empirical probability of rolling a (total of) 7. 
25 Assuming keyword RND returns a pseudorandom number, write an algorithm
to simulate 1200 trials of rolling a single (fair) dodecahedral die, and 
to output the results and the empirical probability of rolling a 7. (Hint: A 
dodecahedral die has twelve faces numbered 1-12.) 

26 Assuming keyword RND returns a pseudorandom number, write an algorithm 
to simulate 1200 trials of rolling five (fair) dodecahedral dice and output the 
empirical probability of rolling a (sum of) 30. 
The Combinatorics of
Finite Functions 

Choose if you dare. 
— Pierre Corneille (Heraclius, Act IV, Scene iv) 

In Chapter 2, we enter the second stratum of combinatorics. The material here is 
deeper, in the sense that the objects of study are functions. Functions of finite 
sets have a very different flavor from the kinds of functions one sees, e.g., in 
calculus or linear algebra. Ironically, it is probably the simplicity of these functions
that makes them feel so unfamiliar. On the other hand, there is a good deal of back-
and-forth interplay with the material of Chapter 1. Stirling's triangles, for example, 
have much in common with the better known triangle of Pascal. 

Partitions of positive integers were introduced in Section 1.8. The different notion
of partitions of finite sets arises in Section 2.1 in the context of counting onto functions.
Properties and applications of Stirling numbers of the second kind are the theme of 
Section 2.2, where an unexpected connection with sums of powers of positive inte- 
gers is revealed. Together with the tools of Chapter 1, Stirling numbers give us the 
means to solve a class of problems historically stated in terms of balls and urns. 

Introduced in the context of fixed points, the famous principle of inclusion and 
exclusion is the topic of Section 2.3, and Section 2.4 involves cycle structure and 
Stirling numbers of the first kind. In the final section, Stirling numbers of the first 
kind are expressed in terms of the elementary numbers (elementary symmetric 
functions) of Section 1.9. Section 2.5 concludes with a remarkable connection 
between the two kinds of Stirling numbers. 

2.1. STIRLING NUMBERS OF THE SECOND KIND 

It is easy for any former calculus student to come up with lots of examples of functions,
e.g., $f(x) = x^2$, $f(x) = \sin(x)$, or $f(x) = \ln(x)$. In Chapter 1, we discussed
some functions that could easily have come from a course in multivariable calculus,
e.g., $E_2(x, y, z) = xy + xz + yz$.

Strictly speaking, a function is comprised of three parts, a domain D, a range R,
and a "rule of assignment" f that associates to each $x \in D$ a unique element
$f(x) \in R$. In single variable calculus, R is typically the set of real numbers and D
is the largest of its subsets for which the rule of assignment makes sense. If
$f(x) = \ln(x)$, then $D = (0, \infty)$. If $f(x) = 1/x$, D is the set of nonzero real numbers.

In these familiar examples, both D and R are infinite sets. The most practical 
way to describe a rule of assignment in these circumstances is by means of a 
formula. Implicit in the formula $f(x) = x^2$ is an algorithm for evaluating $f(x)$.
Computing $f(3)$ is trivial. On the other hand, no comparable algorithm is implicit
in $f(x) = \ln(x)$. Because it is no more than a name for the mysterious power of e
that it takes to produce 3, computing ln(3) is anything but trivial.

The good news about functions of finite sets is that, at least in principle, there is 
no need for formulas or algorithms. Rules of assignment can be given by means of 
lists or sequences. Suppose, e.g., that D = {1, 2, 3, 4} and R = {1, 2, 3, 4, 5}. Then

$f(1) = 2$, $f(2) = 1$, $f(3) = 2$, $f(4) = 5$ (2.1)

completes the description of a unique function. Instead of a formula like
$f(x) = x^2 - 4x + 5$, this function can be expressed as $f = (2, 1, 2, 5)$.

2.1.1 Example. Suppose D = {1, 2, 3, 4} and R = {1, 2, 3, 4, 5}. What function
is given by the rule g = (5, 3, 1, 3)? In sequence notation, g(i) is listed in the ith
place. So,

$g(1) = 5$, $g(2) = 3$, $g(3) = 1$, $g(4) = 3$.

What about (4, 1, 5, 3, 3)? This sequence does not correspond to any function of
D = {1, 2, 3, 4}. Its length is wrong. The functions on D = {1, 2, 3, 4} are
represented by sequences of length 4. Similarly, h = (6, 2, 3, 1) could not possibly
be a function from D into R = {1, 2, 3, 4, 5} because h(1) = 6 is not an element
of R. □

2.1.2 Definition. Denote by $F_{m,n}$ the set of all functions from D = {1, 2, ..., m}
into R = {1, 2, ..., n}. The notation for $f \in F_{m,n}$ is

$f = (f(1), f(2), \ldots, f(m))$.

2.1.3 Example. The set of all possible functions from {1, 2, 3} into {1, 2} is
$F_{3,2}$ = {(1,1,1), (1,1,2), (1,2,1), (1,2,2), (2,1,1), (2,1,2), (2,2,1), (2,2,2)}.



*The image of $f$, $\{f(x) : x \in D\}$, is a subset of R.
Similarly,

$F_{2,3}$ = {(1,1), (1,2), (1,3), (2,1), (2,2), (2,3), (3,1), (3,2), (3,3)}. □

In Example 2.1.3, the elements of $F_{3,2}$ and $F_{2,3}$ were listed in so-called
dictionary order.

2.1.4 Definition. Suppose $f, g \in F_{m,n}$, $f \ne g$. Let i be the smallest positive
integer such that $f(i) \ne g(i)$. If $f(i) < g(i)$, then f comes before g in dictionary
order, and we write $f < g$.

2.1.5 Example. If f = (2, 2, 1) and g = (2, 1, 2), then f(1) = 2 = g(1), but
f(2) = 2 > 1 = g(2). So, f comes after g in dictionary order, i.e., f > g. [In
Example 2.1.3, f = (2, 2, 1) comes immediately after g = (2, 1, 2).]

The smallest positive integer i such that $f(i) \ne g(i)$ corresponds to the first place
in their respective sequences that f and g differ. In this case, i = 2. In particular, it is
irrelevant which of f(3) and g(3) is larger. □

Thinking of $F_{3,2}$ as the set of all functions from {1, 2, 3} into {1, 2} may take
some getting used to. For one thing, $F_{3,2}$ is finite. To count the functions in $F_{m,n}$,
observe that there are n choices for each of f(1), f(2), ..., f(m). Therefore,
$o(F_{m,n}) = n^m$.

2.1.6 Example. Is $o(F_{2,3})$ equal to 8 or 9? Note that m and n are read first m then
n in $F_{m,n}$ but first n then m in $n^m$. In particular, $o(F_{2,3}) = 3^2$. (Is it obvious, just by
glancing at $F_{2,3}$ and $F_{3,2}$ in Example 2.1.3, that $F_{2,3}$ is the larger set?) □
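
The count $o(F_{m,n}) = n^m$ is easy to confirm by machine for small m and n. The following Python sketch is offered only as an illustration; the name `functions` is hypothetical, and it relies on the fact that `itertools.product` emits its tuples in dictionary order when the input range is increasing.

```python
from itertools import product


def functions(m, n):
    """Return F_{m,n} as a list of sequences (f(1), ..., f(m)), in dictionary order."""
    return list(product(range(1, n + 1), repeat=m))


if __name__ == "__main__":
    print(functions(3, 2))                             # compare with Example 2.1.3
    print(len(functions(3, 2)), len(functions(2, 3)))  # compare with 2**3 and 3**2
```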

Recall that f is one-to-one if and only if $f(x_1) = f(x_2)$ implies $x_1 = x_2$.
Represented by sequences without repetitions, the one-to-one functions in $F_{m,n}$
are easy to count. There are n choices for f(1), n − 1 choices for f(2), ..., and
n − (m − 1) = n − m + 1 choices for f(m). The product of these numbers is
$P(n, m) = n(n - 1) \cdots (n - m + 1)$. Again, there is a reversal of m and n. In "the
one-to-one functions in $F_{m,n}$", m is read before n. In "P(n, m)", it's the other way
around.

Among the $3^2 = 9$ functions in $F_{2,3}$, P(3, 2) = 3 × 2 = 6 are one-to-one. [The
three remaining functions are (1, 1), (2, 2), and (3, 3).] No function in $F_{3,2}$ is
one-to-one. If m > n, then P(n, m) = 0.
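
As a quick check on these counts, the one-to-one functions can be listed directly. This is a sketch only: the name `one_to_one_functions` is an assumption of the example, and `math.perm` (available in Python 3.8 and later) is used for P(n, m).

```python
from itertools import permutations
from math import perm


def one_to_one_functions(m, n):
    """Return the one-to-one functions in F_{m,n} as sequences without repetitions."""
    return list(permutations(range(1, n + 1), m))


if __name__ == "__main__":
    print(one_to_one_functions(2, 3), perm(3, 2))  # six sequences; P(3, 2)
    print(one_to_one_functions(3, 2), perm(2, 3))  # none when m > n; P(2, 3)
```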

Among the one-to-one functions are the increasing functions. 

2.1.7 Definition. Denote by $Q_{m,n} \subset F_{m,n}$ the set of (strictly) increasing functions,
i.e., $f \in Q_{m,n}$ if (and only if) $1 \le f(1) < f(2) < \cdots < f(m) \le n$.



* Dictionary order is also known as lexicographic order. 
2.1.8 Example. In dictionary order, $Q_{2,3}$ = {(1,2), (1,3), (2,3)}, $Q_{3,3}$ =
{(1,2,3)}, and $Q_{3,5}$ = {(1,2,3), (1,2,4), (1,2,5), (1,3,4), (1,3,5), (1,4,5),
(2,3,4), (2,3,5), (2,4,5), (3,4,5)}. □

To count the functions in $Q_{m,n}$, observe that an increasing sequence is uniquely
determined by the integers that it contains. Once they have been chosen, there is
just one way to arrange them into increasing order. Therefore, $o(Q_{m,n}) = C(n, m)$.
(Note the "reversal" of m and n.) That $o(Q_{2,3}) = C(3, 2) = 3$, $o(Q_{3,3}) =
C(3, 3) = 1$, and $o(Q_{3,5}) = C(5, 3) = 10$ can be confirmed by glancing at
Example 2.1.8.
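
The observation that an increasing sequence is determined by the set of integers it contains translates directly into code. In this hedged Python sketch, the hypothetical function `increasing_functions` simply asks `itertools.combinations` for the m-element subsets of {1, ..., n}; each subset is returned already arranged in increasing order.

```python
from itertools import combinations
from math import comb


def increasing_functions(m, n):
    """Return Q_{m,n}, the strictly increasing sequences, in dictionary order."""
    return list(combinations(range(1, n + 1), m))


if __name__ == "__main__":
    print(increasing_functions(3, 5))                   # compare with Example 2.1.8
    print(len(increasing_functions(3, 5)), comb(5, 3))  # o(Q_{3,5}) and C(5, 3)
```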

There is a curious irony about the identity $o(Q_{m,n}) = C(n, m)$. While the elements
of $Q_{m,n}$ are ordered sequences, order doesn't matter in their enumeration.
(Recall the similar semantic difficulty in connection with arranging the parts of a
partition from largest to smallest.)

One application of $Q_{m,n}$ is an explicit formula for elementary symmetric functions.

2.1.9 Theorem. If n is a fixed positive integer, then

$E_m(x_1, x_2, \ldots, x_n) = \sum_{f \in Q_{m,n}} \prod_{i=1}^{m} x_{f(i)}, \quad 1 \le m \le n.$ (2.2)

Proof. Recall that $E_m(x_1, x_2, \ldots, x_n)$ is the sum of all products of the x's taken m
at a time. Equation (2.2) is obtained by observing that each selection of m variables
corresponds to a unique function $f \in Q_{m,n}$. ■

2.1.10 Example. Let's use Equation (2.2) to evaluate $E_2(x_1, x_2, x_3)$. From
Example 2.1.8, $Q_{2,3}$ = {(1,2), (1,3), (2,3)}. If f = (1, 2), then $\prod x_{f(i)} = x_1 x_2$; if
f = (1, 3), then $\prod x_{f(i)} = x_1 x_3$; and if f = (2, 3), then $\prod x_{f(i)} = x_2 x_3$. The sum of
these products is $x_1 x_2 + x_1 x_3 + x_2 x_3 = E_2(x_1, x_2, x_3)$. □
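
Equation (2.2) can also be read as a recipe for computation: run through $Q_{m,n}$ and add up the corresponding products. The sketch below is one way to do that in Python; the function name `elementary_symmetric` is hypothetical, and `math.prod` requires Python 3.8 or later.

```python
from itertools import combinations
from math import prod


def elementary_symmetric(m, xs):
    """Evaluate E_m(x_1, ..., x_n) by summing over f in Q_{m,n}, as in Equation (2.2)."""
    n = len(xs)
    total = 0
    for f in combinations(range(1, n + 1), m):  # f runs over Q_{m,n}
        total += prod(xs[i - 1] for i in f)     # the product of x_{f(i)}, i = 1..m
    return total


if __name__ == "__main__":
    # With x_1 = 1, x_2 = 2, x_3 = 3 the three products of Example 2.1.10 are 2, 3, and 6.
    print(elementary_symmetric(2, [1, 2, 3]))
```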

One interesting thing about Equation (2.2) is the way it blends two very different 
species of function. Elementary symmetric functions are fairly sophisticated 
polynomials in several variables. It makes sense, e.g., to say things like "the partial
derivative of $E_m(x_1, x_2, \ldots, x_n)$ with respect to the variable $x_n$ is
$E_{m-1}(x_1, x_2, \ldots, x_{n-1})$." On the other hand, it makes no sense at all to talk about the
derivative of some finite function $f \in Q_{m,n}$.

Recall that $f : D \to R$ is onto if and only if $\{f(x) : x \in D\} = R$. If m < n, then
a sequence of length m cannot contain all the integers in {1, 2, ..., n}. So, $m \ge n$ is
a necessary condition for there to exist any onto functions in $F_{m,n}$. Okay, assuming
$m \ge n$, how many of the $n^m$ functions in $F_{m,n}$ are onto? This problem is not so easily
solved as its one-to-one counterpart. The solution begins with the following.

2.1.11 Definition. If $y \in \{1, 2, \ldots, n\}$ and $f \in F_{m,n}$, then $f^{-1}(y) = \{x : f(x) = y\}$.

A potentially troublesome feature of Definition 2.1.11 is its abuse of the usual
language. Recall that $f : D \to R$ has an inverse, $f^{-1} : R \to D$, if and only if f is
one-to-one and onto, in which case $f^{-1}(y)$ is the unique $x \in D$ such that f(x) = y. If
f is not one-to-one, there may be more than one such x, and that is what Definition
2.1.11 seeks to capture: $f^{-1}(y)$ is the set of all such x's. (Note that f is onto if and
only if $f^{-1}(y)$ is nonempty for all $y \in R$.)

If f is one-to-one and onto, and if f(x) = y, then the notation of Definition 2.1.11
yields $f^{-1}(y) = \{x\}$ rather than $f^{-1}(y) = x$, which may cause some confusion. If f
is not one-to-one and onto, there should be no confusion. When the ordinary inverse
does not exist, $f^{-1}$ can be interpreted in only one way, namely, the one given by
Definition 2.1.11.

Finally, the variables needn't be called x or y. Integer variables commonly have
names like i, j, and k. If $f \in F_{m,n}$, then, e.g., $f^{-1}(j)$ is the subset of {1, 2, ..., m}
consisting of all those integers i such that f(i) = j.
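
Definition 2.1.11 is straightforward to implement for functions given in sequence notation. In the Python sketch below, the names `preimage` and `is_onto` are hypothetical; the sample function f = (2, 1, 2, 5) is the one from Equation (2.1), viewed as an element of $F_{4,5}$.

```python
def preimage(f, j):
    """Return f^{-1}(j) = {i : f(i) = j} for f given as the sequence (f(1), ..., f(m))."""
    return {i for i, value in enumerate(f, start=1) if value == j}


def is_onto(f, n):
    """f in F_{m,n} is onto if and only if f^{-1}(j) is nonempty for every j in {1, ..., n}."""
    return all(preimage(f, j) for j in range(1, n + 1))


if __name__ == "__main__":
    f = (2, 1, 2, 5)
    for j in range(1, 6):
        print(j, preimage(f, j))
    print(is_onto(f, 5))
```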



2.1.12 Example. If $f = (2, 1, 2, 5) \in F_{4,5}$, then $f^{-1}(1) = \{2\}$, $f^{-1}(2) = \{1, 3\}$,
$f^{-1}(3) = \emptyset = f^{-1}(4)$, and $f^{-1}(5) = \{4\}$. If $g = (7, 4, 2, 8, 3) \in F_{m,n}$, then m = 5
and $n \ge 8$. Because g is one-to-one, $o(g^{-1}(j)) \le 1$, $1 \le j \le n$. Since, e.g.,
$o(g^{-1}(5)) = 0$, g is not onto. □

2.1.13 Lemma. Suppose $f \in F_{m,n}$. Then f is one-to-one if and only if
$o(f^{-1}(j)) \le 1$, $1 \le j \le n$, and f is onto if and only if $o(f^{-1}(j)) \ge 1$, $1 \le j \le n$.

Proof. Immediate from the definitions.

Among the topics discussed in Chapter 1 are partitions of the positive integer n.
We are about to abuse the language again by using the word "partition" in a
different way.

2.1.14 Definition. Let S be a set. A partition of S is an unordered collection of
pairwise disjoint, nonempty subsets of S whose union is all of S. The subsets of a
partition are called blocks.

For $S = A_1 \cup A_2 \cup \cdots \cup A_k$ to be a partition of S, two things are required: (1)
$A_i \cap A_j = \emptyset$ whenever $i \ne j$ and (2) $A_i \ne \emptyset$, $1 \le i \le k$.

2.1.15 Example. Two partitions are equal if and only if they have the same
blocks. So, e.g., {1} ∪ {2,3}, {1} ∪ {3,2}, and {2,3} ∪ {1} are three different-looking
ways to write the same two-block partition of S = {1, 2, 3}. The other
partitions of S are {1} ∪ {2} ∪ {3}, having three blocks; {1,2} ∪ {3} and
{1,3} ∪ {2}, each having two blocks; and {1,2,3}, having just one block. In
particular, S has a total of five different partitions. □

What do partitions and onto functions have in common? Suppose $f \in F_{m,n}$. Let
D = {1, 2, ..., m}. Because it is the domain of f,

$D = \bigcup_{j=1}^{n} f^{-1}(j).$ (2.3)

If $i \in f^{-1}(j_1) \cap f^{-1}(j_2)$, then $j_1 = f(i) = j_2$, and so, because f is a function, $j_1 = j_2$.
Therefore, $f^{-1}(j_1)$ and $f^{-1}(j_2)$ are disjoint whenever $j_1 \ne j_2$. Moreover, f is onto if
and only if $f^{-1}(j) \ne \emptyset$ for all $j \in \{1, 2, \ldots, n\}$. Let's summarize.

2.1.16 Lemma. The function $f \in F_{m,n}$ is onto if and only if Equation (2.3) is a
partition of D = {1, 2, ..., m}.

2.1.17 Definition. The number of partitions of {1, 2, ..., m} into n blocks is
denoted S(m, n) and called a Stirling number of the second kind.

Evidently, S(m, n) = 0 if n < 1 or n > m. Because there is just one way to
partition {1, 2, ..., m} into a single block and {1} ∪ {2} ∪ ⋯ ∪ {m} is the unique
(unordered) way to express it as the disjoint union of m nonempty subsets,
S(m, 1) = 1 = S(m, m).

2.1.18 Example. In Example 2.1.15 we saw, e.g., that S(3, 2) = 3. The two-block
partitions of {1, 2, 3, 4} are

{1} ∪ {2,3,4}, {2} ∪ {1,3,4}, {3} ∪ {1,2,4}, {4} ∪ {1,2,3},
{1,2} ∪ {3,4}, {1,3} ∪ {2,4}, and {1,4} ∪ {2,3},

so S(4, 2) = 7. The three-block partitions of {1, 2, 3, 4} are

{1} ∪ {2} ∪ {3,4}, {1} ∪ {3} ∪ {2,4}, {1} ∪ {4} ∪ {2,3},
{2} ∪ {3} ∪ {1,4}, {2} ∪ {4} ∪ {1,3}, and {3} ∪ {4} ∪ {1,2}.

So, S(4, 3) = 6. □
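
Lists like the ones in Example 2.1.18 can be generated mechanically. The recursive Python sketch below is one possible approach, not a method taken from the text: the first element either forms a block by itself or is added to one of the blocks of a partition of the remaining elements. The name `set_partitions` is hypothetical.

```python
def set_partitions(elements, n):
    """Yield the partitions of the list `elements` into exactly n nonempty blocks."""
    if not elements:
        if n == 0:
            yield []           # the empty set has one partition, with no blocks
        return
    first, rest = elements[0], elements[1:]
    # `first` is alone in its block.
    for p in set_partitions(rest, n - 1):
        yield [[first]] + p
    # `first` shares a block with other elements.
    for p in set_partitions(rest, n):
        for k in range(len(p)):
            yield p[:k] + [[first] + p[k]] + p[k + 1:]


if __name__ == "__main__":
    for p in set_partitions([1, 2, 3, 4], 2):
        print(p)
```

Counting the partitions produced for {1, 2, 3, 4} with n = 2 and n = 3 gives a check on S(4, 2) and S(4, 3).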
Onto functions and Stirling numbers come together in the next result. 

2.1.19 Theorem. The number of onto functions in $F_{m,n}$ is n! S(m, n).

Proof. If n > m, there are no n-part partitions of {1, 2, ..., m} and no onto
functions in $F_{m,n}$. When $n \le m$, the theorem is proved by establishing a many-to-one
correspondence between onto functions and n-block partitions.

By Lemma 2.1.16, each onto function $f \in F_{m,n}$ affords a unique partition,
namely, $f^{-1}(1) \cup f^{-1}(2) \cup \cdots \cup f^{-1}(n)$. Indeed, from the perspective of f, this is
an ordered partition. For onto function $f \in F_{m,n}$ to afford partition $A_1 \cup A_2 \cup \cdots \cup A_n$,
it isn't necessary for $A_1$ to be $f^{-1}(1)$. Since partitions are unordered, the block
$A_1$ could just as well be $f^{-1}(j)$ for any $j \in \{1, 2, \ldots, n\}$. There are n ways to choose
an integer $j_1$ to satisfy $A_1 = f^{-1}(j_1)$, n − 1 ways to choose $j_2$ so that $A_2 = f^{-1}(j_2)$,
n − 2 ways to choose $j_3$, and so on. Evidently, each of the S(m, n) n-block partitions
of {1, 2, ..., m} can be arranged in n! ways, corresponding to the ordered partitions
afforded by n! different onto functions $f \in F_{m,n}$. ■
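
The many-to-one correspondence in this proof can be observed by brute force for small m and n. The Python sketch below (with hypothetical names `onto_functions` and `afforded_partition`) lists the onto functions, records the unordered partition each one affords, and compares the two counts.

```python
from itertools import product
from math import factorial


def onto_functions(m, n):
    """Return the onto functions in F_{m,n} as sequences (f(1), ..., f(m))."""
    target = set(range(1, n + 1))
    return [f for f in product(range(1, n + 1), repeat=m) if set(f) == target]


def afforded_partition(f, n):
    """Return the unordered partition {f^{-1}(1), ..., f^{-1}(n)} of {1, ..., m}."""
    return frozenset(
        frozenset(i for i, value in enumerate(f, start=1) if value == j)
        for j in range(1, n + 1)
    )


if __name__ == "__main__":
    m, n = 4, 2
    onto = onto_functions(m, n)
    blocks = {afforded_partition(f, n) for f in onto}
    # The number of onto functions, the number of partitions, and n! times the latter.
    print(len(onto), len(blocks), factorial(n) * len(blocks))
```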

Named for James Stirling (1692-1770). The terminology suggests the existence, at the very least, of 
Stirling numbers of the first kind. 
From Example 2.1.18, S(3, 2) = 3, S(4, 2) = 7, and S(4, 3) = 6. Together with
S(m, 1) = 1 = S(m, m), $m \ge 1$, this gives us a start at filling in some of the entries
of Stirling's triangle (Fig. 2.1.1).

m\n    1      2        3        4        5        6      7
 1     1
 2     1      1
 3     1      3        1
 4     1      7        6        1
 5     1    S(5,2)   S(5,3)   S(5,4)     1
 6     1    S(6,2)   S(6,3)   S(6,4)   S(6,5)     1
 7     1    S(7,2)   S(7,3)   S(7,4)   S(7,5)   S(7,6)   1

Figure 2.1.1. Stirling's triangle.

2.1.20 Theorem. If $m \ge n \ge 2$, then S(m + 1, n) = S(m, n − 1) + nS(m, n).

Theorem 2.1.20 allows us to fill in as many rows of Fig. 2.1.1 as we like,
e.g.,

S(5, 2) = S(4, 1) + 2S(4, 2) = 1 + 2 × 7 = 15,

S(5, 3) = S(4, 2) + 3S(4, 3) = 7 + 3 × 6 = 25,

S(5, 4) = S(4, 3) + 4S(4, 4) = 6 + 4 × 1 = 10.

Thus, we obtain Fig. 2.1.2.

m\n    1     2     3     4     5     6    7
 1     1
 2     1     1
 3     1     3     1
 4     1     7     6     1
 5     1    15    25    10     1
 6     1    31    90    65    15     1
 7     1    63   301   350   140    21    1

Figure 2.1.2. Stirling numbers of the second kind, S(m, n).

Proof of Theorem 2.1.20. The n-block partitions of T = {1, 2, ..., m, m + 1}
can be divided into two types, those for which m + 1 is alone in its block and those
for which it isn't. Counting partitions of the first type is easy: If {m + 1} is a block
of the partition, then the remaining m elements of T can be partitioned into n − 1
blocks in S(m, n − 1) ways.

If m + 1 is not isolated, then removing m + 1 from its block produces an n-part
partition of {1, 2, ..., m}, say, $A_1 \cup A_2 \cup \cdots \cup A_n$. Now, this same partition would
arise if m + 1 had been removed from any one of the blocks $A_i$, $1 \le i \le n$. In other
words, to each n-part partition of {1, 2, ..., m} there correspond n different n-part
partitions of T of the second type, i.e., there are (exactly) nS(m, n) partitions of T in
which m + 1 shares its block with at least one other integer. ■
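
The recurrence of Theorem 2.1.20 makes it easy to fill in Stirling's triangle by machine. The following Python sketch is offered under one stated assumption: each row is padded with S(m, 0) = 0 and S(m, m + 1) = 0, so that the same formula also produces the boundary entries S(m + 1, 1) = 1 = S(m + 1, m + 1). The name `stirling_triangle` is hypothetical.

```python
def stirling_triangle(rows):
    """Return rows m = 1, ..., rows of S(m, n), 1 <= n <= m, as a list of lists."""
    triangle = [[1]]                        # S(1, 1) = 1
    for m in range(1, rows):
        prev = [0] + triangle[-1] + [0]     # prev[n] = S(m, n) for 0 <= n <= m + 1
        # S(m + 1, n) = S(m, n - 1) + n * S(m, n), for n = 1, ..., m + 1
        triangle.append([prev[n - 1] + n * prev[n] for n in range(1, m + 2)])
    return triangle


if __name__ == "__main__":
    for row in stirling_triangle(7):
        print(row)
```

The seven rows printed can be compared with Fig. 2.1.2.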

2.1.21 Example. Observe that

2 + 1 = 3 is prime,
2 × 3 + 1 = 7 is prime,
2 × 3 × 5 + 1 = 31 is prime,
2 × 3 × 5 × 7 + 1 = 211 is prime,
2 × 3 × 5 × 7 × 11 + 1 = 2311 is prime,

but (maybe 13 is unlucky)

2 × 3 × 5 × 7 × 11 × 13 + 1 = 30,031 = 59 × 509

is not. However, because 59 and 509 are primes, this is the only nontrivial factorization
of 30,031. By way of comparison, if its immediate predecessor

30,031 − 1 = 30,030 = 2 × 3 × 5 × 7 × 11 × 13 = dq,

where 1 < d < q, then the prime factors of d and q correspond to the blocks of a
two-part partition of {2, 3, 5, 7, 11, 13}. Moreover, because partitions are unordered,
this correspondence is one-to-one, i.e., 30,030 has (exactly) S(6, 2) = 31
different factorizations as a product of two integers each greater than 1. □
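
The two counts in this example are small enough to confirm by trial division. The Python sketch below uses a hypothetical helper, `two_factorizations`, that lists every way to write n = dq with 1 < d < q.

```python
def two_factorizations(n):
    """Return all pairs (d, q) with n = d * q and 1 < d < q."""
    pairs = []
    d = 2
    while d * d < n:            # d < q forces d * d < n
        if n % d == 0:
            pairs.append((d, n // d))
        d += 1
    return pairs


if __name__ == "__main__":
    print(two_factorizations(30031))       # the single factorization 59 x 509
    print(len(two_factorizations(30030)))  # compare with S(6, 2)
```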
2.1. EXERCISES

1 Find $f(3)$ and $f^{-1}(4)$ if

(a) f = (4, 1, 5). (b) f = (9, 4, 5, 4, 2).
(c) f = (3, 3, 4, 4). (d) f = (4, 3, 2, 1).
(e) f = (8, 2). (f) f = (4, 4, 4, 4, 4).



2 Compute

(a) $o(F_{5,3})$. (b) $o(F_{3,5})$. (c) o(F M ).
(d) $o(Q_{5,3})$. (e) $o(Q_{3,5})$. (f) o(Gm).

3 Write down all the one-to-one functions in
(a) $F_{2,3}$. (b) $F_{3,3}$. (c) $F_{4,3}$.

(d) $Q_{2,3}$. (e) $Q_{3,3}$. (f) $Q_{3,4}$.

4 Write down all the onto functions in
(a) $F_{3,2}$. (b) $F_{3,3}$. (c) $F_{3,4}$.

(d) $Q_{2,3}$. (e) $Q_{3,3}$. (f) $Q_{4,4}$.

5 Compute S(m, n), $1 \le n \le m$, $8 \le m \le 9$.

6 Show that 

(a) S(n + 1, n) = C(n + 1, 2).

(b) S(n + 2, n) = C(n + 2, 3) + 3C(n + 2, 4).

(c) $S(n + 1, 2) = 2^n - 1$.

7 Suppose $n = p_1 p_2 \cdots p_r$, where $p_1, p_2, \ldots, p_r$ are distinct primes and r > 2.
Prove that n can be factored as n = dq, where 1 < d < q, in exactly S(r, 2)
different ways.

8 Prove that 



9 Between $Q_{m,n}$ and $F_{m,n}$ is $G_{m,n}$, the set of nondecreasing functions, i.e.,
$f \in G_{m,n}$ if and only if $1 \le f(1) \le f(2) \le \cdots \le f(m) \le n$.

(a) List the elements of

(b) List the elements of $G_{3,3}$.

(c) Prove that $o(G_{m,n}) = C(m + n - 1, m)$.

10 The homogeneous symmetric function $H_m(x_1, x_2, \ldots, x_n)$ was introduced in
Exercise 25, Section 1.8. It is the sum of all C(m + n − 1, m) different
monomials of degree m in the variables $x_1, x_2, \ldots, x_n$.



m 
(a) Show that

$H_m(x_1, x_2, \ldots, x_n) = \sum_{f \in G_{m,n}} \prod_{i=1}^{m} x_{f(i)},$

where $G_{m,n}$ is the set of nondecreasing functions (sequences) defined in
Exercise 9.

(b) Use part (a) to compute $H_2(1, 2, 3)$.

(c) Show that $H_3(1, 2, 3) = 90$.

(d) Without evaluating any of the three terms, show that $H_3(1, 2, 3, 4) =
H_3(1, 2, 3) + 4H_2(1, 2, 3, 4)$.

11 Let $H_m(x_1, x_2, \ldots, x_n)$ be the homogeneous symmetric function from Exercise
10.

(a) Prove that $H_{m+1}(x_1, x_2, \ldots, x_n) = H_{m+1}(x_1, x_2, \ldots, x_{n-1}) + x_n H_m(x_1, x_2, \ldots, x_n)$,
$n \ge 2$.

(b) Define h(m, m) = 1 and $h(m, n) = H_{m-n}(1, 2, \ldots, n)$, m > n. Prove that
h(m + 1, n) = h(m, n − 1) + nh(m, n), $m \ge n \ge 2$.

(c) Prove that $S(m, n) = H_{m-n}(1, 2, \ldots, n)$, $m \ge n \ge 1$.

(d) Prove that

$S(n + r, n) = \sum_{f \in G_{r,n}} \prod_{i=1}^{r} f(i).$

12 The image of $f \in F_{m,n}$ is

$$\mathrm{image}(f) = \{f(x) : x \in \{1, 2, \ldots, m\}\},$$

i.e., image(f) is the set of numbers that occur in the sequence (f(1),
f(2), ..., f(m)). Prove that the number of functions $f \in F_{m,n}$ that satisfy
o(image(f)) = t is $n!\,S(m,t)/(n-t)!$.

13 Prove the following analog of Chu's theorem:

$$S(m+1, n+1) = \sum_{k=n}^{m} C(m,k)\,S(k,n).$$

14 In how many ways can 30,030 be factored as a product of three integers,
a × b × c, where 1 < a < b < c?

15 Organize the set of area codes (4, 1, 5), (2, 1, 3), (2, 1, 2), (2, 0, 5), (2, 0, 2), 
(7, 0, 7), (4, 0, 5), (8, 0, 5), and (8, 1, 8) into dictionary order. 

16 A substitution code encrypts ordinary text messages by uniformly replacing
each letter with a substitute. Among the simplest of these are the Caesar
cyphers, in which each letter is replaced by the one coming n places after it
(or before it if n is negative) in alphabetical order. In the Stanley Kubrick film
2001: A Space Odyssey, the computer's name, HAL, is a Caesar cypher for
IBM, corresponding to n = -1. (It has been said that an early Roman emperor
amused himself by handing the following note to a messenger and ordering
him to carry it to the local military commander: "JHKK SGD ADZQDQ NE
SGHR MNSD.")

Code breaking frequently involves the notion of a word pattern. The pattern 
WXYZXW, for example, is common to several English words, e.g., EVOLVE, 
LARVAL, READER, RENTER, SERIES, and TIDBIT. (Note that REGRET 
exhibits a different pattern, namely, WXYWXZ.) There are no English words 
with pattern WWWWWW (the same as pattern XXXXXX) nor, for that 
matter, with pattern WWXYY (the same as QQALL).* 

Denote by T(m, n) the number of different m-letter word patterns that use a
total of n different letters. (Then, e.g., T(3,1) = 1, T(3,2) = 3, and
T(3,3) = 1.)

(a) Compute T(4,n), $1 \le n \le 4$.

(b) There are two four-letter English words having word pattern XYXX. Find 
one of them. 

(c) Show that T(m, 1) = 1 = T(m,m). 

(d) Prove that the array of word pattern numbers is identical to the array of
Stirling numbers of the second kind, i.e., for all positive integers m,
T(m, n) = S(m, n), $1 \le n \le m$.

17 Let S = {1, 2, 3, 4, 5}. In how many partitions of S will 

(a) 1 and 2 be in the same block of S?

(b) 1 and 2 be in different blocks of S? 

18 Write an algorithm/program to generate and list, in dictionary order,

(a) all the one-to-one functions in $F_{4,4}$. (Hint: How is this different from
listing all 4! rearrangements of 1234?)

(b) all $4^4$ functions in $F_{4,4}$. (Hint: Start from scratch.)

(c) all five functions in $Q_{4,5}$.

(d) all C(6, 4) functions in $Q_{4,6}$.

(e) all C(6, 4) four-element subsets of {1,2,3,4,5,6}.

19 Write an algorithm/program to input $x_i$, $1 \le i \le 6$, and output
$E_4(x_1, x_2, \ldots, x_6)$.

20 Denote by $S_k(m,n)$ the number of partitions of {1,2,...,m} into n blocks
each of which contains at least k elements. Show that

(a) $S(m,n) = S_1(m,n)$.

(b) $S_k(m+1, n) = C(m, k-1)\,S_k(m-k+1, n-1) + n\,S_k(m,n)$.



*See, e.g., S. W. Golomb, On the enumeration of cryptograms, Math. Mag. 53 (1980), 219-221.

21 Let $G_{m,n} \subset F_{m,n}$ be the set of nondecreasing functions from Exercise 9.
Compute $o(\{f \in G_{m,n} : o(\{j : o(f^{-1}(j)) \ge 2\}) = 1\})$.

22 There is an analog of the fundamental theorem of symmetric polynomials
(Appendix A1) for the homogeneous symmetric functions of Exercise 10: Any
polynomial symmetric in the variables $x_1, x_2, \ldots, x_n$ is a polynomial in the
homogeneous symmetric functions $H_m(x_1, x_2, \ldots, x_n)$, $1 \le m \le n$.

(a) Show that the elementary symmetric function $E_2(x,y,z) = H_1(x,y,z)^2 - H_2(x,y,z)$.

(b) Show that the second power sum $M_2(x,y,z) = 2H_2(x,y,z) - H_1(x,y,z)^2$.

(c) Express $E_3(x,y,z)$ as a polynomial in $H_m(x,y,z)$, $1 \le m \le 3$.

(d) Express $M_3(x,y,z)$ as a polynomial in $H_m(x,y,z)$, $1 \le m \le 3$.

(e) Express $M_4(x,y,z)$ as a polynomial in $H_m(x,y,z)$, $1 \le m \le 3$.

23 Let $H_m = H_m(x, y, z)$ be the homogeneous symmetric function of Exercise 10.
For each partition $\pi \vdash 4$, let $M_\pi = M_\pi(x, y, z)$. Show that

(a) $H_1^4 = M_{[4]} + 4M_{[3,1]} + 6M_{[2,2]} + 12M_{[2,1^2]}$.

(b) $H_1^2 H_2 = M_{[4]} + 3M_{[3,1]} + 4M_{[2,2]} + 7M_{[2,1^2]}$.

(c) $H_2^2 = M_{[4]} + 2M_{[3,1]} + 3M_{[2,2]} + 4M_{[2,1^2]}$.

(d) $H_1 H_3 = M_{[4]} + 2M_{[3,1]} + 2M_{[2,2]} + 3M_{[2,1^2]}$.

24 An equivalence relation on S = {1,2,3,4,5,6,7} partitions the set into the
disjoint union of equivalence classes.

(a) Show that every partition of S corresponds to the family of equivalence 
classes for some equivalence relation. 

(b) How many different equivalence relations on S are there? 

25 Write an algorithm/program to compute S(m,n), $1 \le m \le 12$, $1 \le n \le m$.
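
For the programming exercises, one possible starting point (a sketch, not the only reasonable design) is to build the triangle of Stirling numbers row by row from the recurrence S(m,n) = S(m-1,n-1) + nS(m-1,n). The function name `stirling_triangle` and the convention S(0,0) = 1 are assumptions made here for illustration.

```python
def stirling_triangle(max_m):
    """Return a table S with S[m][n] = S(m, n) for 0 <= n <= m <= max_m."""
    S = [[0] * (max_m + 1) for _ in range(max_m + 1)]
    S[0][0] = 1  # assumed convention: one partition of the empty set into zero blocks
    for m in range(1, max_m + 1):
        for n in range(1, m + 1):
            # Element m is either a block by itself or joins one of n existing blocks.
            S[m][n] = S[m - 1][n - 1] + n * S[m - 1][n]
    return S


if __name__ == "__main__":
    table = stirling_triangle(12)
    for m in range(1, 13):
        print(m, table[m][1:m + 1])
```

Row sums of the table give the total number of partitions of {1, 2, ..., m}, which offers a quick sanity check against known values.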



2.2. BELLS, BALLS, AND URNS 

Heard melodies are sweet, but those unheard are sweeter. 

— John Keats (Ode on a Grecian Urn) 

Recall that the falling factorial function is defined by $x^{(0)} = 1$ and

$$x^{(n)} = x(x-1)(x-2)\cdots(x-n+1), \quad n > 0.$$

If $m \ge 1$, then $x^{(m)}$ is a polynomial of degree m whose roots are 0, 1, 2, ..., m - 1.
Thus,

$$x^{(m)} = x^m - e(m-1,1)x^{m-1} + e(m-1,2)x^{m-2} - \cdots + (-1)^{m-1} e(m-1,m-1)x,$$ (2.4)

where the elementary number $e(m-1, r) = E_r(1, 2, \ldots, m-1) = E_r(0, 1, 2, \ldots, m-1)$,
$1 \le r < m$.

2.2.1 Example. Let's confirm Equation (2.4). From Fig. 1.9.2,

$$x^{(3)} = x^3 - e(2,1)x^2 + e(2,2)x = x^3 - 3x^2 + 2x$$

when m = 3 and

$$x^{(4)} = x^4 - e(3,1)x^3 + e(3,2)x^2 - e(3,3)x = x^4 - 6x^3 + 11x^2 - 6x$$

when m = 4. On the other hand, from the definition,

$$x^{(3)} = x(x-1)(x-2) = x(x^2 - 3x + 2),$$

and

$$x^{(4)} = x(x-1)(x-2)(x-3) = (x^2 - x)(x^2 - 5x + 6) = x^4 - (1+5)x^3 + (5+6)x^2 - 6x.$$ □
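
The expansions in Example 2.2.1 can also be checked by machine. The sketch below multiplies out x(x-1)...(x-m+1) one factor at a time and returns the coefficients with the lowest degree first; the function name `falling_factorial_coeffs` is hypothetical, chosen only for this illustration.

```python
def falling_factorial_coeffs(m):
    """Coefficients of x(x-1)...(x-m+1), lowest degree first."""
    coeffs = [1]  # the constant polynomial 1, i.e., the empty product
    for k in range(m):
        shifted = [0] + coeffs                  # x * p(x)
        scaled = [k * c for c in coeffs] + [0]  # k * p(x), padded to equal length
        coeffs = [a - b for a, b in zip(shifted, scaled)]  # (x - k) * p(x)
    return coeffs


if __name__ == "__main__":
    print(falling_factorial_coeffs(3))  # coefficients of x, x^2, x^3 follow the leading 0
    print(falling_factorial_coeffs(4))
```

Up to sign, the entries are the elementary numbers e(m-1, r) that appear in Equation (2.4).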

Equation (2.4) is an explicit expression of the (obvious) fact that $x^{(m)}$ is some
linear combination of $x, x^2, x^3, \ldots, x^m$. On the other hand, because $x^{(r)}$ has degree
r, it must also be the case that $x^m$ is some (unique) linear combination of
$x^{(1)}, x^{(2)}, x^{(3)}, \ldots, x^{(m)}$. More remarkable is the fact that the coefficients in this
inverse expression are Stirling numbers of the second kind!

2.2.2 Theorem. For any positive integer m,

$$x^m = \sum_{r=1}^{m} S(m,r)\,x^{(r)}.$$



See, e.g., Exercise 28, Section 1.5, or Exercise 24, Section 1.9.

Let's confirm this identity when m = 4. Together with the fourth row of Fig. 2.1.2
(read backward!), Theorem 2.2.2 yields

$$x^4 = S(4,4)x^{(4)} + S(4,3)x^{(3)} + S(4,2)x^{(2)} + S(4,1)x^{(1)}$$
$$= (x^4 - 6x^3 + 11x^2 - 6x) + 6(x^3 - 3x^2 + 2x) + 7(x^2 - x) + x$$
$$= x^4 + (-6 + 6)x^3 + (11 - 18 + 7)x^2 + (-6 + 12 - 7 + 1)x.$$
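
The same confirmation can be carried out numerically for other values of m. The following is a minimal sketch, assuming the usual recurrence for S(m, r); the helper names `stirling2`, `falling`, and `power_via_stirling` are introduced here for illustration only.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(m, n):
    """Stirling number of the second kind, from the standard recurrence."""
    if m == n:
        return 1
    if n == 0 or n > m:
        return 0
    return stirling2(m - 1, n - 1) + n * stirling2(m - 1, n)


def falling(x, r):
    """Falling factorial x(x-1)...(x-r+1)."""
    result = 1
    for i in range(r):
        result *= x - i
    return result


def power_via_stirling(x, m):
    """Right-hand side of Theorem 2.2.2, evaluated at the number x."""
    return sum(stirling2(m, r) * falling(x, r) for r in range(1, m + 1))


if __name__ == "__main__":
    print(power_via_stirling(3, 4), 3 ** 4)
```

Because both sides are polynomials of degree m, agreement at more than m values of x is already enough to confirm the identity for that m.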

Proof of Theorem 2.2.2. Because S(1, 1) = 1 and $x = x^{(1)}$, the m = 1 case is
trivial. If m > 1, then, by induction,

$$x^m = x^{m-1} \cdot x = \Bigl[\sum_{r=1}^{m-1} S(m-1,r)\,x^{(r)}\Bigr] x = \sum_{r=1}^{m-1} S(m-1,r)\,[x^{(r)} x].$$ (2.5)

Because x = (x - r) + r,

$$x^{(r)} x = x^{(r)}(x - r) + x^{(r)} r = x^{(r+1)} + r x^{(r)}.$$

Substituting this identity into Equation (2.5) and reorganizing, we obtain

$$x^m = \sum_{r=1}^{m-1} S(m-1,r)\,x^{(r+1)} + \sum_{r=1}^{m-1} r S(m-1,r)\,x^{(r)}.$$

Changing the variable in the first summation yields

$$x^m = \sum_{r=2}^{m} S(m-1,r-1)\,x^{(r)} + \sum_{r=1}^{m-1} r S(m-1,r)\,x^{(r)}$$
$$= x^{(m)} + \sum_{r=2}^{m-1} S(m-1,r-1)\,x^{(r)} + \sum_{r=2}^{m-1} r S(m-1,r)\,x^{(r)} + x^{(1)}$$
$$= x^{(m)} + \sum_{r=2}^{m-1} [S(m-1,r-1) + r S(m-1,r)]\,x^{(r)} + x^{(1)}$$
$$= \sum_{r=1}^{m} S(m,r)\,x^{(r)},$$

because $S(m, r) = S(m-1, r-1) + r S(m-1, r)$, $2 \le r \le m-1$. ■

2.2.3 Corollary. For all positive integers k and m,

$$k^m = \sum_{r=1}^{m} r!\,S(m,r)\,C(k,r).$$ (2.6)

Proof. Because Theorem 2.2.2 is a polynomial identity, we can substitute any
number we like for x. Setting x = k gives

$$k^m = \sum_{r=1}^{m} S(m,r)\,P(k,r) = \sum_{r=1}^{m} r!\,S(m,r)\,P(k,r)/r! = \sum_{r=1}^{m} r!\,S(m,r)\,C(k,r).$$ ■

Recall the approach that was used in Section 1.5 to obtain a formula for the sum
of the mth powers of the first n positive integers. If

$$k^m = \sum_{r=1}^{m} a_{r,m}\,C(k,r),$$ (2.7)

then, by Equation (1.10),

$$1^m + 2^m + \cdots + n^m = \sum_{r=1}^{m} a_{r,m}\,C(n+1, r+1).$$

Inverting the n × n Pascal matrix $C_n$ whose (i, j)-entry is C(i, j), we obtained
(Theorem 1.5.5) the unique solution

$$a_{r,m} = \sum_{t=1}^{m} (-1)^{r+t}\,C(r,t)\,t^m.$$ (2.8a)

It follows from Equations (2.6) and (2.7) that

$$a_{r,m} = r!\,S(m,r).$$ (2.8b)

Two conclusions can be drawn from these observations. The first is a new formula
for the sum of the mth powers of the first n positive integers, namely,

$$1^m + 2^m + \cdots + n^m = \sum_{r=1}^{m} r!\,S(m,r)\,C(n+1, r+1).$$ (2.9)
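
Equation (2.9) is easy to test against a direct summation. The sketch below assumes the standard recurrence for S(m, r); `stirling2` and `power_sum` are names made up for this illustration.

```python
from functools import lru_cache
from math import comb, factorial


@lru_cache(maxsize=None)
def stirling2(m, n):
    """Stirling number of the second kind, from the standard recurrence."""
    if m == n:
        return 1
    if n == 0 or n > m:
        return 0
    return stirling2(m - 1, n - 1) + n * stirling2(m - 1, n)


def power_sum(m, n):
    """1^m + 2^m + ... + n^m computed from the right-hand side of Equation (2.9)."""
    return sum(factorial(r) * stirling2(m, r) * comb(n + 1, r + 1)
               for r in range(1, m + 1))


if __name__ == "__main__":
    m, n = 3, 10
    print(power_sum(m, n), sum(k ** m for k in range(1, n + 1)))
```

The right-hand side has only m terms, however large n is, which is the practical appeal of the formula.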

The second is a new formula for the number of onto functions in $F_{m,r}$.

2.2.4 Corollary (Stirling's Identity*). For any two positive integers m and r,

$$r!\,S(m,r) = \sum_{t=1}^{r} (-1)^{r+t}\,C(r,t)\,t^m.$$

Proof. Equations (2.8a) and (2.8b). ■

2.2.5 Example. For $1 \le r \le m = 4$, Stirling's identity produces

$$S(4,1) = C(1,1)1^4 = 1,$$
$$S(4,2) = \tfrac{1}{2}[-C(2,1)1^4 + C(2,2)2^4] = \tfrac{1}{2}[-2 + 16] = 7,$$
$$S(4,3) = \tfrac{1}{6}[C(3,1)1^4 - C(3,2)2^4 + C(3,3)3^4] = \tfrac{1}{6}[3 - 48 + 81] = 6,$$
$$S(4,4) = \tfrac{1}{24}[-C(4,1)1^4 + C(4,2)2^4 - C(4,3)3^4 + C(4,4)4^4] = \tfrac{1}{24}[-4 + 96 - 324 + 256] = 1.$$

While its usefulness to computing S(m, r) may be restricted to $r \le m$, Stirling's
identity remains valid when r > m. If r = 4 and m = 3, e.g.,

$$4!\,S(3,4) = -C(4,1)1^3 + C(4,2)2^3 - C(4,3)3^3 + C(4,4)4^3$$
$$= -4 \times 1 + 6 \times 8 - 4 \times 27 + 1 \times 64$$
$$= -4 + 48 - 108 + 64$$
$$= 0.$$

Indeed, Stirling's identity implies that

$$C(r,r)r^m - C(r,r-1)(r-1)^m + \cdots + (-1)^{r+1} C(r,1)1^m = 0$$

for all r > m. □
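
The arithmetic of Example 2.2.5 can be reproduced with a few lines of code. This is only a sketch of the alternating sum in Stirling's identity; the function names `stirling_sum` and `stirling2_from_identity` are hypothetical.

```python
from math import comb, factorial


def stirling_sum(m, r):
    """Alternating sum on the right-hand side of Stirling's identity."""
    return sum((-1) ** (r + t) * comb(r, t) * t ** m for t in range(1, r + 1))


def stirling2_from_identity(m, r):
    """Recover S(m, r) by dividing the alternating sum by r!."""
    total = stirling_sum(m, r)
    assert total % factorial(r) == 0  # the identity guarantees exact divisibility
    return total // factorial(r)


if __name__ == "__main__":
    print([stirling2_from_identity(4, r) for r in range(1, 5)])
    print(stirling_sum(3, 4))  # the r > m case discussed in the example
```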

Recall that the nth-row sum of the partition triangle is $\sum_{r=1}^{n} p_r(n) = p(n)$, the
total number of partitions of n. Similarly, $\sum_{r=1}^{n} S(n,r)$ is the total number of
partitions of {1, 2, ..., n}.

2.2.6 Definition. The Bell numbers† are defined by $B_0 = 1$ and

$$B_n = \sum_{r=1}^{n} S(n,r), \quad n \ge 1.$$

*Not to be confused with Stirling's formula: $n!/n^n \approx \sqrt{2\pi n}/e^n$.
†After Eric Temple Bell (1883-1960).

From Figure 2.1.2, the Bell sequence (starting with $B_0$) is 1, 1, 2, 5, 15, 52, 203,
877, ....

2.2.7 Theorem. The Bell numbers satisfy the recurrence

$$B_{n+1} = \sum_{r=0}^{n} C(n,r)\,B_r.$$ (2.10)

Equation (2.10) is reminiscent of the binomial theorem. Changing each subscript
to a superscript gives a (nonsensical) way to remember Equation (2.10):

$$B^{n+1} = \sum_{r=0}^{n} C(n,r)\,B^r = (B+1)^n.$$

Proof of Theorem 2.2.7. In any partition of {1, 2, ..., n, n + 1}, the number
n + 1 belongs to a unique block. Apart from n + 1 itself, this block contains
some k other elements, where $0 \le k \le n$. Because the k companions of n + 1
can be chosen from {1, 2, ..., n} in C(n, k) ways and the remaining n - k elements
can be partitioned into blocks in $B_{n-k}$ ways, the number of partitions in which n + 1
belongs to a block with k other elements is $C(n, k)B_{n-k}$. Summing over r = n - k
yields

$$B_{n+1} = \sum_{r=0}^{n} C(n, n-r)\,B_r = \sum_{r=0}^{n} C(n,r)\,B_r.$$ ■
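
Recurrence (2.10) translates directly into a short program. The sketch below is one way to do it; the function name `bell_numbers` and its argument (how many terms to produce, assumed to be at least 1) are choices made here for illustration.

```python
from math import comb


def bell_numbers(count):
    """Return [B_0, B_1, ..., B_{count-1}] using recurrence (2.10); count >= 1."""
    bell = [1]  # B_0 = 1
    for n in range(count - 1):
        # B_{n+1} = sum over r of C(n, r) * B_r
        bell.append(sum(comb(n, r) * bell[r] for r in range(n + 1)))
    return bell


if __name__ == "__main__":
    print(bell_numbers(8))
```

Each new term reuses all of the earlier ones, so the list is built from the front and no Stirling numbers are needed.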

Among other things, the Bell numbers enumerate equivalence relations. 

2.2.8 Definition. Let S be a set. A binary relation ~ on S is an equivalence 
relation if it satisfies three properties: 

1. x ~ x for all $x \in S$;

2. if x ~ y, then y ~ x; 

3. if x ~ y, and y ~ z, then x ~ z. 

2.2.9 Theorem. If o(S) = n, then the number of different equivalence relations
on S is the nth Bell number $B_n$.

Proof. If $s \in S$, the equivalence class to which s belongs is $\{x \in S : s \sim x\}$. Two
equivalence classes are either disjoint or identical. In particular, the different
equivalence classes comprise a partition of S. Conversely, any partition of S is the family
of equivalence classes for some equivalence relation. Thus, the number of
equivalence relations is equal to the number of partitions of S. ■
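
For very small sets, Theorem 2.2.9 can be checked by brute force: generate every binary relation on an n-element set and keep those satisfying the three properties of Definition 2.2.8. The sketch below does exactly that; it examines 2^(n^2) relations, so it is practical only for n up to about 4, and the function names are hypothetical.

```python
from itertools import product


def is_equivalence(rel, n):
    """Check reflexivity, symmetry, and transitivity of rel on {0, ..., n-1}."""
    if any((x, x) not in rel for x in range(n)):
        return False
    if any((y, x) not in rel for (x, y) in rel):
        return False
    return all((x, z) in rel
               for (x, y) in rel for (y2, z) in rel if y == y2)


def count_equivalence_relations(n):
    """Count equivalence relations on an n-element set by exhaustive search."""
    pairs = [(x, y) for x in range(n) for y in range(n)]
    count = 0
    for bits in product((False, True), repeat=len(pairs)):
        rel = {p for p, keep in zip(pairs, bits) if keep}
        if is_equivalence(rel, n):
            count += 1
    return count


if __name__ == "__main__":
    print([count_equivalence_relations(n) for n in range(1, 5)])
```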

Turning to other applications, there is a family of problems (somewhat analogous
to the four "choosing" problems of Section 1.6) that are traditionally stated
in terms of balls and urns. The general problem involves the question, "In how
many different ways can m balls be distributed among n urns?" The answer 
depends upon how the word "different" is interpreted. It may be, for example, 
that among the balls are Ping-Pong balls, golf balls, baseballs, and volleyballs. 
The urns might come in red, white, or blue versions. More formally, we would 
like to be able to allow for the possibility of equivalence relations on the sets of 
balls and urns. 

For now, we adopt an "all-or-nothing" attitude. Either the balls are all equiva- 
lent or all inequivalent and, independently, the urns are all identical or all different. 
In this context, the words labeled and unlabeled are useful. If we can't tell the balls 
apart, we'll say they are unlabeled; if the urns are all different from each other, 
we'll say they are labeled. (In all cases, we presume that balls and urns can 
be distinguished from each other!) Another consideration is whether to allow 
some of the urns to wind up empty. So, at this stage, there are eight variations of 
the problem. 

Let's begin with the two cases in which the balls are labeled but the urns are not. 
It really doesn't matter how the balls are labeled as long as the labels suffice to 
distinguish one ball from another. So, we may as well suppose the balls are labeled 
with the numbers 1, 2, . . . ,m. 

Variation 1. In how many ways can m labeled balls be distributed among n
unlabeled urns if no urn is left empty? Stripping away the colorful terminology 
of balls and urns, this is just asking in how many ways the set {1,2, ... ,m} can 
be partitioned into n blocks. The answer is S(m, n). 

2.2.10 Example. In how many ways can four labeled balls be distributed among 
two unlabeled urns if no urn is left empty? According to Variation 1, the answer is
S(4,2) = 7. If the balls are labeled 1, 2, 3, and 4, then the seven possibilities are

{1} & {2,3,4}; {2} & {1,3,4}; {3} & {1,2,4}; {4} & {1,2,3};
{1,2} & {3,4}; {1,3} & {2,4}; and {1,4} & {2,3}.

(Because the urns are unlabeled, {1} & {2,3,4} is the same as {2,3,4} & {1}. 
Since it is a set, {2, 3, 4} = {3, 4, 2}.) □ 

Variation 2. In how many ways can m labeled balls be distributed among n 
unlabeled urns? Since it is no longer a requirement that no urn be left empty, 
this is the same as asking for the number of ways in which {1,2, ... ,m} can be 
partitioned into n or fewer blocks. The answer is 

S(m, 1) + S(m, 2) + • • • + S(m, n). 

(When $m \le n$, this sum is the mth Bell number $B_m$.)

* Since the balls are free to roll around in the urns, the order in which the balls are distributed among the
urns doesn't matter.

2.2.11 Example. The number of ways to distribute four labeled balls among two
unlabeled urns is S(4,1) + S(4,2) = 1 + 7 = 8. In addition to the 7 possibilities
listed in Example 2.2.10, we have {1,2,3,4} & {}, the case in which one of the 
urns winds up empty. (Because the urns are indistinguishable, the question of which 
urn is left empty does not arise.) □ 

Turning to the cases in which the balls are labeled 1, 2, ..., m and the urns are
labeled 1, 2, ..., n, each distribution of balls among urns is uniquely described by a
function $f \in F_{m,n}$, where f(i) = j is interpreted to mean that the ith ball is assigned
to the jth urn.

Variation 3. In how many ways can m labeled balls be distributed among n
labeled urns? The answer is just $o(F_{m,n}) = n^m$.

2.2.12 Example. Four labeled balls can be distributed among two labeled urns
in $2^4 = 16$ ways. Indeed, now that the urns can be distinguished (maybe one of
them is chipped), why not just double the answer from Example 2.2.11? What if
there were three labeled balls and three unlabeled urns? Then, by Variation 2, there
would be $B_3 = 5$ ways to distribute the balls, and 3! × 5 = 30, whereas the
correct answer for the number of ways to distribute three labeled balls among three
labeled urns is $3^3 = 27$. In this case, the four unlabeled solutions

{1} & {2,3} & {}; {2} & {1,3} & {}; {3} & {1,2} & {};
and {1} & {2} & {3}

each have six labeled counterparts, while {1,2,3} & {} & {} has only three,
for a total of 4 × 6 + 3 = 27. □

Variation 4. In how many ways can m labeled balls be distributed among n
labeled urns if no urn is left empty? The answer is n!S(m,n), the number of onto
functions in $F_{m,n}$.

This time, the obvious shortcut is valid. By Variation 1, there are S(m, n) ways to
distribute m labeled balls among n unlabeled urns. Once the balls have been
distributed, there are n! ways to label the urns. By the fundamental counting principle, the
answer we seek is n!S(m,n).
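
All four variations with labeled balls can be confirmed for small m and n by exhaustive enumeration. The sketch below lists every function from balls to urns and, for unlabeled urns, identifies two assignments when they induce the same partition of the balls. The function names and the `allow_empty` flag are assumptions introduced for this illustration.

```python
from itertools import product


def count_labeled_urns(m, n, allow_empty):
    """Count assignments of m labeled balls to n labeled urns."""
    total = 0
    for f in product(range(1, n + 1), repeat=m):
        if allow_empty or len(set(f)) == n:  # onto means every urn is used
            total += 1
    return total


def count_unlabeled_urns(m, n, allow_empty):
    """Count assignments of m labeled balls to n unlabeled urns."""
    seen = set()
    for f in product(range(n), repeat=m):
        blocks = {}
        for ball, urn in enumerate(f, start=1):
            blocks.setdefault(urn, []).append(ball)
        # Forget which urn is which: keep only the set of nonempty blocks.
        partition = frozenset(frozenset(b) for b in blocks.values())
        if allow_empty or len(partition) == n:
            seen.add(partition)
    return len(seen)


if __name__ == "__main__":
    print(count_unlabeled_urns(4, 2, False), count_unlabeled_urns(4, 2, True))
    print(count_labeled_urns(4, 2, True), count_labeled_urns(4, 2, False))
```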

In fact, there is a third approach to Variation 4. While it could not be called a 
shortcut, it is useful in another way. Let's begin with an example. 

2.2.13 Example. In how many ways can five labeled balls be distributed among
three labeled urns if no urn is left empty? As an example of Variation 4, the answer
is 3!S(5, 3). But consider the following alternate approach: Label the balls 1, 2, 3,
4, 5 and the urns 1, 2, 3. As before, we describe a distribution of balls among urns
by means of a function $f \in F_{5,3}$, but this time concentrate on the sequence

$$f = (f(1), f(2), f(3), f(4), f(5)).$$

For our present purposes, it is useful to abbreviate the sequence, writing it in the
form

$$f = f(1)f(2)f(3)f(4)f(5).$$

Thus, e.g., f = 31121 is the assignment of ball 1 to urn 3, ball 4 to urn 2, and balls
2, 3, and 5 to urn 1. While 31121 may look like a number, we are going to view it
as a word. There is a one-to-one correspondence between assignments of balls to
urns and five-letter words produced from the alphabet {1, 2, 3}. Moreover,
assignments leaving no urn empty correspond to words that use all three "letters".

Let's examine some possibilities. If all three letters are used, the maximum
multiplicity any one letter can have is three, and then only when each of the other
two letters occurs exactly once. Just to get warmed up, how many five-letter words
use 1 three times and each of 2 and 3 just once? The answer is the multinomial
coefficient $\binom{5}{3,1,1}$. Similarly, the number of five-letter words that use three 2's,
one 1, and one 3 is $\binom{5}{1,3,1}$; and $\binom{5}{1,1,3}$ is the number that use three 3's, one 1,
and one 2.

If no "letter" occurs as often as three times, then the only possibility is that one
of the letters occurs once and the other two occur twice. Since the letter used only
once can be any one of 1, 2, or 3, the number of possibilities of this type is

$$\binom{5}{1,2,2} + \binom{5}{2,1,2} + \binom{5}{2,2,1}.$$

Putting it all together, we obtain the identity

$$3!\,S(5,3) = \Bigl[\binom{5}{3,1,1} + \binom{5}{1,3,1} + \binom{5}{1,1,3}\Bigr] + \Bigl[\binom{5}{1,2,2} + \binom{5}{2,1,2} + \binom{5}{2,2,1}\Bigr].$$

While the two sides of this equation may look different, they had better not be
different. Indeed, 6 × 25 = 3 × 20 + 3 × 30. □
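
The bookkeeping in Example 2.2.13 can be mechanized: list every way of writing m as an ordered sum of n positive integers and add up the corresponding multinomial coefficients. The following is a sketch under that reading; `compositions`, `multinomial`, and `onto_count` are names invented for this illustration.

```python
from math import factorial


def compositions(m, n):
    """Yield all n-tuples of positive integers that sum to m."""
    if n == 1:
        if m >= 1:
            yield (m,)
        return
    # The first part must leave at least 1 for each of the remaining n - 1 parts.
    for first in range(1, m - n + 2):
        for rest in compositions(m - first, n - 1):
            yield (first,) + rest


def multinomial(parts):
    """Multinomial coefficient (sum of parts)! / (product of part!)."""
    result = factorial(sum(parts))
    for p in parts:
        result //= factorial(p)
    return result


def onto_count(m, n):
    """Sum of multinomial coefficients over all n-part compositions of m."""
    return sum(multinomial(c) for c in compositions(m, n))


if __name__ == "__main__":
    print(sorted(compositions(5, 3)))
    print(onto_count(5, 3))
```

For m = 5 and n = 3 there are six compositions, three of each type discussed in the example.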

Example 2.2.13 can be generalized as follows.

2.2.14 Theorem. If $m \ge n$, then the number of onto functions in $F_{m,n}$ is

$$n!\,S(m,n) = \sum \binom{m}{r_1, r_2, \ldots, r_n},$$

where the summation is over all n-part compositions of m, i.e., over those
multinomial coefficients having exactly n positive integers in the bottom row.

2.2.15 Example. Apart from (1,1,1,1) and (2,2,2,2), the remaining $2^4 - 2 = 14$
functions in $F_{4,2}$ are onto. To confirm Theorem 2.1.19, observe that
2!S(4,2) = 2 × 7 = 14. To confirm Theorem 2.2.14, observe that

$$\binom{4}{3,1} + \binom{4}{2,2} + \binom{4}{1,3} = 4 + 6 + 4 = 14.$$ □

The number of ways to distribute m unlabeled balls among n labeled urns
$U_1, U_2, \ldots, U_n$ is the number of solutions to the equation

$$u_1 + u_2 + \cdots + u_n = m$$

in positive integers (no empty urns) or nonnegative integers (empty urns allowed).
The number of ways to distribute m unlabeled balls among n unlabeled urns is the
number of n-part partitions of m (no empty urns) or the number of partitions of m
into at most n parts (empty urns allowed). Details are left to the exercises.



2.2. EXERCISES

1 Confirm

(a) Equation (2.4) when m = 5.

(b) Theorem 2.2.2 when m = 5.

(c) Corollary 2.2.4 when m = 5 and $1 \le r \le 5$.

(d) Theorem 2.2.14 when m = 5 and $1 \le n \le 5$.

2 Confirm that the m = 4 case of Corollary 2.2.3 is identical to Equation (1.12).
(Hint: Fig. 2.1.2.)

3 In the spirit of Exercise 2, explicitly compute the numbers

(a) $a_{r,5} = r!\,S(5,r)$, $1 \le r \le 5$. (b) $a_{r,6}$, $1 \le r \le 6$.

4 Use Stirling's identity (Corollary 2.2.4) to

(a) confirm that S(6,2) = 31.

(b) confirm that S(7,2) = 63.

(c) compute S(8,2).

(d) prove that $S(m,2) = 2^{m-1} - 1$.

5 Compare and contrast Stirling's identity (Corollary 2.2.4) with

$$(t-1)^m = \sum_{r=0}^{m} (-1)^{m-r}\,C(m,r)\,t^r.$$

6 Show that $B_n = \sum_{r=1}^{n} \sum_{t=1}^{r} (-1)^{r+t}\,C(r,t)\,t^n / r!$. (Hint: $B_n$ is a sum of Stirling
numbers.)

7 From Figure 2.1.2, the first eight Bell numbers (starting with $B_0$) are 1, 1, 2, 5,
15, 52, 203, and 877. Use these data to confirm that $B_7 = C(6,0)B_0 +
C(6,1)B_1 + \cdots + C(6,6)B_6$.

8 Denoting the nth Bell number by $B_n$,

(a) compute $B_8$. (b) show that $B_9$ = 21,147.

9 Explain how the identification of the word pattern number T(m, n) with 
the Stirling number S(m,n) in Exercise 16(d), Section 2.1, follows from 
Theorem 2.2.14. 

10 How many of the equivalence relations on {1, 2, ..., n} afford exactly k
equivalence classes? 

11 Confirm Theorem 2.2.14 when 

(a) m = 6 and n = 2. (b) m = 6 and n = 3. 

12 Use Exercise 12, Section 2.1, as the basis of a new proof of Corollary 2.2.3. 

13 In how many ways can five identical black ceramic Maltese falcons be 
distributed among the Turnage brothers Bill, Jim, and Robert? 

14 Prove that m unlabeled balls can be distributed among n labeled urns in exactly
C(m + n - 1, m) different ways. (Hint: Label the urns $U_1, U_2, \ldots, U_n$. Let $m_i$
be the number of balls that wind up in urn $U_i$.)

15 Prove that m unlabeled balls can be distributed among n labeled urns, leaving 
no urn empty, in exactly C(m — l,n— 1) different ways. 

16 In how many ways can five identical balls be distributed among three 
unlabeled urns? (Hint: Some urn, since they are not labeled it doesn't matter 
which one, getting all five balls is one way. One urn getting four balls and 
another getting one is a second way.) 

17 In how many ways can five identical grapefruits be distributed among three
unlabeled boxes if no box is left empty? 

18 Prove that m unlabeled balls can be distributed among n unlabeled urns in
$p_1(m) + p_2(m) + \cdots + p_n(m)$ ways, where $p_k(m)$ is the number of k-part
partitions of m.

19 If $m \le n$, show that m unlabeled balls can be distributed among n unlabeled
urns in p(m) ways, where p(m) is the number of partitions of m.

20 In how many ways can six balls be distributed among four urns if 

(a) the urns are labeled but the balls are not? 

(b) the balls are labeled but the urns are not? 

(c) both balls and urns are labeled? 

(d) neither balls nor urns are labeled? 




21 Rework Exercise 20 under the condition that no urn is left empty. 

22 In how many ways can 10 theatre tickets be distributed among the Turnage 
brothers Robert, Jim, and Bill if the tickets 

(a) are for specific seats in the auditorium? 

(b) are for admission to the auditorium where seating is on a first-come, first- 
served basis? 

23 Given ten unlabeled balls and four unlabeled urns, in how many ways can the 
balls be distributed among the urns if no urn is left empty? 

24 In how many ways can nine balls be distributed among five urns if no urn is 
left empty and 

(a) the balls are labeled but the urns are not. 

(b) neither the balls nor the urns are labeled. 

(c) the urns are labeled but the balls are not. 

(d) both the balls and the urns are labeled. 

25 Rework Exercise 24 when empty urns are permitted. 

26 Fill in the blanks (with actual numbers, as opposed to names for numbers). 

(a) $x^5 = x^{(5)} + \_\,x^{(4)} + \_\,x^{(3)} + \_\,x^{(2)} + \_\,x^{(1)}$.

(b) $n^5$ = P(n, 5) + _ P(n, 4) + _ P(n, 3) + _ P(n, 2) + _ P(n, 1).

(c) $n^5$ = _ C(n, 5) + _ C(n, 4) + _ C(n, 3) + _ C(n, 2) + _ C(n, 1).

27 Suppose m houses are to be painted. Assume that x colors are available but that 
each house is to be uniformly painted just one color. The houses are labeled 
(by their street addresses) and the colors can be distinguished from each other 
(so they are labeled too). 

(a) Show that the number of ways to paint the m houses using exactly r of the
x colors is $S(m, r)\,x^{(r)}$.

(b) In how many ways can the m houses be painted using m or fewer of the x 
colors? 

(c) Give a combinatorial proof of Theorem 2.2.2. 

28 In the Bose-Einstein model of statistical mechanics, each of r identical 
particles can have any one of k different energy levels. 

(a) How many energy states can such a system exhibit? 

(b) Suppose there are six particles and four energy levels. Assuming all the 
states are equally likely, what is the probability that all six particles will 
have the same energy? 

29 Suppose $U_1, U_2, \ldots, U_n$ are (labeled) urns and $r_1 + r_2 + \cdots + r_n = m$ is an
n-part composition of m. In how many ways can m labeled balls be distributed
among the urns so that urn $U_k$ receives exactly $r_k$ balls, $1 \le k \le n$?

30 The number of different (unordered) ways to express 30 as a sum of (one or
more) integers greater than zero is p(30) = 5604.

(a) List the $B_3 = 5$ different (unordered) ways to express 30 as a product of
(one or more) integers each of which is greater than 1.

(b) Show that there are 203 (unordered) ways to write 30,030 as a product of 
(one or more) integers greater than 1. 

31 Suppose $n = p_1 p_2 \cdots p_r$, where $p_1, p_2, \ldots, p_r$ are r different primes (so n is
"square free"). Let $\Pi(n)$ be the number of different (unordered) ways to write
n as a product of (one or more) integers greater than 1. Prove that $\Pi(n) = B_r$,
the rth Bell number.

32 Rephrase the astragali problem from Exercise 26, Section 1.6, in terms of balls 
and urns. 

33 Let n be a positive integer and p a (positive) prime that is not a factor of n.
Those having some acquaintance with congruences will recognize that
$n^{p-1} \equiv 1 \pmod{p}$ is a consequence of Fermat's little theorem (Section 1.7,
Exercise 11(b)). Use this result, along with Stirling's identity, to prove
Wilson's theorem: $(p-1)! \equiv -1 \pmod{p}$.

34 In how many ways can 24 students be evenly divided into six "teams" 

(a) if the teams are "labeled". 

(b) if the teams are "unlabeled". 

35 In how many ways can 10 students be divided into three "teams" if each team 
has at least three students and 

(a) the teams are "labeled". 

(b) the teams are "unlabeled". 

36 Suppose balls 1, 2, 3, 4, and 5 are distributed randomly among three urns. 
Compute the probability that no urn is left empty if 

(a) the urns are unlabeled. 

(b) the urns are labeled. 

2.3. THE PRINCIPLE OF INCLUSION AND EXCLUSION 

A cow has 12 legs, 2 in front, 2 in back, 2 on each side, and 1 in each corner. 

— N. J. Rose 



Suppose $f : A \to A$ is a function from a set A to itself, i.e., suppose the domain and
range of f are equal. If A is the set of real numbers, it is not difficult to find
functions like $f(x) = e^x$ that are one-to-one but not onto and functions like
$f(x) = x^3 - x$ that are onto but not one-to-one. This kind of thing cannot happen if
A is finite. Specifically, $f \in F_{n,n}$ is one-to-one if and only if it is onto. (The same
thing cannot be said about functions in $F_{m,n}$ when $m \ne n$. There are P(5, 3) = 60
one-to-one functions in $F_{3,5}$, but $F_{3,5}$ contains no onto functions at all; there are
3!S(5, 3) = 150 onto functions in $F_{5,3}$, but $F_{5,3}$ does not contain a single
one-to-one function.)

2.3.1 Definition. A one-to-one function in $F_{n,n}$ is called a permutation. The
subset of $F_{n,n}$ consisting of the one-to-one (onto) functions is denoted $S_n$.

Of the $n^n$ functions in $F_{n,n}$, P(n, n) = n! are one-to-one, so $o(S_n) = n!$. (The same
conclusion follows by counting the n!S(n, n) = n! onto functions in $F_{n,n}$.)
Recognizing the permutations in $F_{n,n}$ is easy. They are the sequences in which no integer
occurs twice.

2.3.2 Example. $F_{2,2}$ = {(1, 1), (1, 2), (2, 1), (2, 2)} and $S_2$ = {(1, 2), (2, 1)}. Of
the $3^3 = 27$ functions in $F_{3,3}$, only 3! = 6 are permutations: $S_3$ = {(1,2,3),
(1,3,2), (2,1,3), (2,3,1), (3,1,2), (3,2,1)}. □

A fixed point of $f \in F_{n,n}$ is an element $i \in \{1, 2, \ldots, n\}$ such that f(i) = i. Some
of the deepest theorems in mathematics involve fixed points. Fixed points of per- 
mutations comprise the foundation of Polya's theory of enumeration (discussed in 
Chapter 3). For the present, we will focus on permutations that have no fixed points. 

2.3.3 Definition. A permutation with no fixed points is called a derangement.
The number of derangements in $S_n$ is denoted D(n).

There is only one permutation $p \in S_1$, and it is completely defined by p(1) = 1.
Because 1 is a fixed point of p, there are no derangements in $S_1$, i.e., D(1) = 0.
There is one derangement in $S_2$, namely (2, 1), so D(2) = 1. In $S_3$ (see
Example 2.3.2), the derangements are (2, 3, 1) and (3, 1, 2), so D(3) = 2. While
one can tell at a glance whether a sequence represents a permutation, it usually takes
more than a glance to recognize a derangement. Identification of functions with
sequences has many advantages, but picking out derangements is not one of them.
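
What takes more than a glance for a person is routine for a program. The sketch below filters the permutations of {1, 2, ..., n} for those with no fixed point; it is a brute-force illustration suitable only for small n, and the function name `derangements` is chosen here for convenience.

```python
from itertools import permutations


def derangements(n):
    """List the permutations p of 1..n, as sequences, with p(i) != i for every i."""
    return [p for p in permutations(range(1, n + 1))
            if all(p[i - 1] != i for i in range(1, n + 1))]


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, len(derangements(n)))
    print(derangements(3))
```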

The easiest (and most illuminating) way to evaluate D(n) involves a new idea.
Let's begin by recalling our discussion of the second counting principle: If A and B
are disjoint, then $o(A \cup B) = o(A) + o(B)$. If A and B are not disjoint, then
$o(A \cup B) < o(A) + o(B)$, because o(A) + o(B) counts every element of $A \cap B$
twice. (See Fig. 2.3.1.) Compensating for this double counting yields the formula

$$o(A \cup B) = o(A) + o(B) - o(A \cap B).$$ (2.11)

What if there are three sets? Then

$$o(A \cup B \cup C) = o(A \cup [B \cup C]) = o(A) + o(B \cup C) - o(A \cap [B \cup C]).$$





Figure 2.3.1 



Applying Equation (2.11) to $o(B \cup C)$ gives

$$o(A \cup B \cup C) = o(A) + [o(B) + o(C) - o(B \cap C)] - o(A \cap [B \cup C]).$$ (2.12)

Because $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$, we can apply Equation (2.11) again to
obtain

$$o(A \cap [B \cup C]) = o(A \cap B) + o(A \cap C) - o(A \cap B \cap C).$$ (2.13)

Finally, a combination of Equations (2.12) and (2.13) produces

$$o(A \cup B \cup C) = [o(A) + o(B) + o(C)] - [o(A \cap B) + o(A \cap C) + o(B \cap C)] + o(A \cap B \cap C).$$ (2.14)

Adding back $o(A \cap B \cap C)$ is, perhaps, the most interesting part of
Equation (2.14). It seems the subtracted term overcompensates for elements that
belong to all three sets. An element of $A \cap B \cap C$ is counted seven times in
Equation (2.14), the first three times with a plus sign, then three times with a minus
sign, and then once more with a plus, for a net count of 3 - 3 + 1 = 1. (See Fig. 2.3.2.)

2.3.4 Example. If A = {1,2,3,4}, B = {3,4,5,6}, and C = {2,4,6,7}, then
$A \cup B \cup C$ = {1, 2, 3, 4, 5, 6, 7}, a set of seven elements. Let's see what
Equation (2.14) produces. Because o(A) = o(B) = o(C) = 4,

$$o(A) + o(B) + o(C) = 12.$$

In this case, it just so happens that $o(A \cap B) = o(A \cap C) = o(B \cap C) = 2$, so

$$o(A \cap B) + o(A \cap C) + o(B \cap C) = 6.$$

Finally, $A \cap B \cap C = \{4\}$, so $o(A \cap B \cap C) = 1$. Substituting these values into
Equation (2.14) yields $o(A \cup B \cup C) = 12 - 6 + 1 = 7$.




Don't misunderstand. No one is suggesting that Equation (2.14) is the easiest 
way to solve this problem. The point of the example is merely to confirm that 
Equation (2.14) generates the correct solution! □ 

Let's skip over four sets and go directly to the general case. 

2.3.5 Principle of Inclusion and Exclusion (PIE). If $A_1, A_2, \ldots, A_n$ are finite
sets, the cardinality of their union is

$$o\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{r=1}^{n} (-1)^{r+1} N_r, \tag{2.15}$$

where

$$N_r = \sum_{f \in Q_{r,n}} o\left(\bigcap_{i=1}^{r} A_{f(i)}\right). \tag{2.16}$$

Because $f \in Q_{r,n}$ if and only if $f$ is a strictly increasing function, $N_r$ is the sum of
the cardinalities of the intersections of the sets taken $r$ at a time. That is,

$$N_1 = \sum_{i=1}^{n} o(A_i), \qquad N_2 = \sum_{\substack{i,j=1 \\ i<j}}^{n} o(A_i \cap A_j), \qquad N_3 = \sum_{\substack{i,j,k=1 \\ i<j<k}}^{n} o(A_i \cap A_j \cap A_k),$$

and so on. Written out, Equations (2.15)-(2.16) look like this:

$$o(A_1 \cup \cdots \cup A_n) = \sum_{i} o(A_i) - \sum_{i<j} o(A_i \cap A_j) + \sum_{i<j<k} o(A_i \cap A_j \cap A_k) - \cdots.$$

Proof. Let $x$ be a fixed but arbitrary element of $A_1 \cup A_2 \cup \cdots \cup A_n$. Then $x$
belongs to some $k$ of the $n$ sets. Without loss of generality, we may assume that
$x$ belongs to the first $k$ sets, i.e., $x \in A_i$, $1 \le i \le k$, and $x \notin A_i$, $k < i \le n$. Let's
compute the contribution of $x$ to $N_r$. For any $f \in Q_{r,n}$, $x \in \bigcap_{i=1}^{r} A_{f(i)}$ if and only if
$f(r) \le k$ if and only if $f \in Q_{r,k}$. Hence, the contribution of $x$ to $N_r$ is
$o(Q_{r,k}) = C(k, r)$, $1 \le r \le k$. So, the contribution of $x$ to the right-hand side of
Equation (2.15) is

$$\sum_{r=1}^{k} (-1)^{r+1} C(k, r) = 1 - \sum_{r=0}^{k} (-1)^{r} C(k, r) = 1$$

(because $\sum_{r=0}^{k} (-1)^{r} C(k, r) = [-1 + 1]^k = 0$). In other words, the right-hand side
of Equation (2.15) counts every element of the union exactly once. ■
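
Equations (2.15) and (2.16) translate almost word for word into a short program. The following Python sketch is only an illustration (the name `pie_union_size` is ours, not the text's); each $r$-element combination of the sets stands in for one strictly increasing $f \in Q_{r,n}$.

```python
from itertools import combinations


def pie_union_size(sets):
    """Cardinality of a union computed as the sum of (-1)^(r+1) * N_r."""
    n = len(sets)
    total = 0
    for r in range(1, n + 1):
        # N_r: sum of the sizes of the intersections taken r at a time.
        n_r = sum(len(set.intersection(*combo)) for combo in combinations(sets, r))
        total += (-1) ** (r + 1) * n_r
    return total


# The three sets of Example 2.3.4.
A, B, C = {1, 2, 3, 4}, {3, 4, 5, 6}, {2, 4, 6, 7}
print(pie_union_size([A, B, C]), len(A | B | C))  # both values are 7
```

The direct computation `len(A | B | C)` is, of course, the easier route; the sketch merely confirms that the alternating sum agrees with it.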

It may seem hard to believe that PIE could ever be useful. In fact, it is exactly the
right tool for counting problems like the one in Example 2.3.4, where, for
$1 \le r \le n$, "it just so happens" that

$$o\left(\bigcap_{i=1}^{r} A_{f(i)}\right)$$

is the same for all $f \in Q_{r,n}$. Let's illustrate with the derangement numbers. If
$A_i = \{p \in S_n : p(i) = i\}$, $1 \le i \le n$, then $A_1 \cup A_2 \cup \cdots \cup A_n$ is the set of
permutations having at least one fixed point, so

$$D(n) = n! - o(A_1 \cup A_2 \cup \cdots \cup A_n).$$

Using the Principle of Inclusion and Exclusion,

$$D(n) = n! - \sum_{r=1}^{n} (-1)^{r+1} N_r. \tag{2.17}$$

To evaluate $N_r$ on the right-hand side of Equation (2.17), let $f \in Q_{r,n}$. Then
$p \in A_{f(1)} \cap A_{f(2)} \cap \cdots \cap A_{f(r)}$ if and only if the numbers $f(1), f(2), \ldots, f(r)$ are
all fixed points of $p$. Because there are no restrictions on how $p$ might permute
the remaining $n - r$ numbers among themselves, there are exactly $(n - r)!$
permutations $p \in S_n$ that fix $f(i)$, $1 \le i \le r$, i.e.,

$$o(A_{f(1)} \cap A_{f(2)} \cap \cdots \cap A_{f(r)}) = (n - r)!,$$

for all $f \in Q_{r,n}$. It follows that $N_r = (n - r)!\,C(n, r) = n!/r!$. Thus, from
Equation (2.17),

$$\begin{aligned} D(n) &= n! - \sum_{r=1}^{n} (-1)^{r+1} \frac{n!}{r!} \\ &= n! - \frac{n!}{1!} + \frac{n!}{2!} - \frac{n!}{3!} + \cdots + (-1)^{n} \frac{n!}{n!} \\ &= n! \left[ \frac{1}{0!} - \frac{1}{1!} + \frac{1}{2!} - \frac{1}{3!} + \cdots + \frac{(-1)^{n}}{n!} \right]. \end{aligned} \tag{2.18}$$



Recall that the power series expansion

$$e^{x} = \sum_{n \ge 0} \frac{x^{n}}{n!}$$

is absolutely convergent for all $x$. Setting $x = -1$, we obtain the alternating series

$$\frac{1}{e} = \frac{1}{0!} - \frac{1}{1!} + \frac{1}{2!} - \frac{1}{3!} + \cdots.$$

By the alternating-series test, the error in the estimate

$$\frac{1}{e} \approx \frac{1}{0!} - \frac{1}{1!} + \frac{1}{2!} - \frac{1}{3!} + \cdots + \frac{(-1)^{n}}{n!}$$

is at most $1/(n + 1)!$. (The notation "$\approx$" means "approximately equal".) It follows
that the error in the estimate

$$D(n) \approx \frac{n!}{e} \tag{2.19}$$

is at most $1/(n + 1)$, which is enough to prove the following.

2.3.6 Theorem. The $n$th derangement number, $D(n)$, is the integer closest to
$n!/e$.

2.3.7 Example. From Equation (2.18), 

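$$\begin{aligned} D(4) &= 4!\left(1 - 1 + \tfrac{1}{2} - \tfrac{1}{6} + \tfrac{1}{24}\right) \\ &= 24 - 24 + 12 - 4 + 1 \\ &= 9, \end{aligned}$$

whereas $4!/e = 8.8291$. Similarly,

$$\begin{aligned} D(5) &= 5!\left(1 - 1 + \tfrac{1}{2} - \tfrac{1}{6} + \tfrac{1}{24} - \tfrac{1}{120}\right) \\ &= 120 - 120 + 60 - 20 + 5 - 1 \\ &= 44, \end{aligned}$$

while $5!/e = 44.1455$. (It turns out that $D(n) > n!/e$ if $n$ is even and $D(n) < n!/e$ if
$n$ is odd.) □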
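
Readers with a computer at hand may want to check Theorem 2.3.6 against a brute-force count. The Python sketch below is one way to do it; the function names are illustrative choices, and permutations are represented as tuples on $0, 1, \ldots, n-1$ rather than $1, 2, \ldots, n$, which does not affect the count.

```python
from itertools import permutations
from math import e, factorial


def derangements_pie(n):
    """D(n) from Equation (2.18), in exact integer arithmetic."""
    # n!/r! is an integer for 0 <= r <= n, so no rounding is involved.
    return sum((-1) ** r * (factorial(n) // factorial(r)) for r in range(n + 1))


def derangements_brute(n):
    """Count the permutations of 0..n-1 that have no fixed point."""
    return sum(all(p[i] != i for i in range(n)) for p in permutations(range(n)))


for n in range(1, 9):
    d = derangements_pie(n)
    assert d == derangements_brute(n)
    # Third column: the integer closest to n!/e.
    print(n, d, round(factorial(n) / e))
```

For the small values of $n$ tried here, floating-point arithmetic is accurate enough for the rounding step; for large $n$ the exact integer formula is the safer choice.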

How many permutations $p \in S_n$ have exactly $k$ fixed points? This is a job for the
fundamental counting principle. There are $C(n, k)$ ways to choose the numbers to
be fixed and $D(n - k)$ ways to derange the remaining $n - k$ "points". So, among
the $n!$ permutations of $S_n$, $C(n, k) \times D(n - k)$ have exactly $k$ fixed points.

Denote by $P(k)$ the fraction* of permutations in $S_n$ that have exactly $k$ fixed
points. If we assume that $n$ is enough larger than $k$ for the estimate
$D(n - k) \approx (n - k)!/e$ to be valid, then

$$P(k) = \frac{C(n, k)\,D(n - k)}{n!} \approx \frac{1}{n!} \cdot \frac{n!}{k!\,(n - k)!} \cdot \frac{(n - k)!}{e} = \frac{1}{k!\,e}. \tag{2.20}$$

It is proved in Section 3.3 that the average of the numbers of fixed points of the
permutations in $S_n$ is 1. Setting $k = 1$ in Equation (2.20) shows that the fraction of
permutations in $S_n$ that have exactly 1 fixed point is $P(1) \approx 1/e$.

2.3.8 Example. Let $F(p)$ be the number of fixed points of
$p \in S_3 = \{(1,2,3), (1,3,2), (2,1,3), (2,3,1), (3,1,2), (3,2,1)\}$. Then $F(1,2,3) = 3$,
$F(1,3,2) = F(2,1,3) = F(3,2,1) = 1$, and $F(2,3,1) = F(3,1,2) = 0$. From these data, it is
easy to see that the average number of fixed points is $[3 + 1 + 1 + 1 + 0 + 0]/6 = 1$,
and easy to confirm that the fraction of permutations in $S_3$ having
exactly one fixed point is $C(3,1)D(2)/6 = 1/2 = 0.5$. (The estimate
$0.5 \approx 1/e = 0.3678794\ldots$ afforded by Equation (2.20) when $n = 3$ and $k = 1$ is
evidently not very good.)

It follows from Theorem 2.3.6 that $D(9) = 133{,}496$. From Equation (2.20), the
fraction of permutations in $S_{10}$ having exactly one fixed point is
$C(10,1)D(9)/10! = D(9)/9! = 0.3678792$, which compares more favorably with $1/e$. □
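
The comparison made in this example is easy to repeat for other values of $n$ and $k$. Here is a minimal Python sketch, assuming only the count $C(n,k) \times D(n-k)$ derived above; the helper names are hypothetical.

```python
from math import comb, e, factorial


def derangements(n):
    """D(n) by the alternating sum n!/0! - n!/1! + n!/2! - ... (D(0) = 1)."""
    return sum((-1) ** r * (factorial(n) // factorial(r)) for r in range(n + 1))


def fixed_point_fraction(n, k):
    """Exact fraction of the permutations in S_n having exactly k fixed points."""
    return comb(n, k) * derangements(n - k) / factorial(n)


for n in (3, 10):
    # Exact fraction for k = 1, next to the limiting estimate 1/e.
    print(n, fixed_point_fraction(n, 1), 1 / e)
```

Summing `comb(n, k) * derangements(n - k)` over $0 \le k \le n$ should return $n!$, which makes a convenient sanity check.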

Let's see how the Principle of Inclusion and Exclusion might be used to produce
new information about Stirling numbers of the second kind. Let
$A_s = \{f \in F_{m,n} : f^{-1}(s) = \emptyset\}$, $1 \le s \le n$. Observe that no $f \in A_s$ can be onto. In
fact, $g \in F_{m,n}$ is onto if and only if

$$g \notin A_1 \cup A_2 \cup \cdots \cup A_n.$$

* Then $P(k)$ is the probability that a randomly chosen permutation in $S_n$ has exactly $k$ fixed points.

Therefore,

$$\begin{aligned} n!\,S(m, n) &= n^m - o(A_1 \cup A_2 \cup \cdots \cup A_n) \\ &= n^m - \sum_{i} o(A_i) + \sum_{i<j} o(A_i \cap A_j) - \sum_{i<j<k} o(A_i \cap A_j \cap A_k) + \cdots. \end{aligned} \tag{2.21}$$

Now, $A_n$ is the set of functions in $F_{m,n}$ that do not map anything to $n$. In fact, it
would be very easy to confuse $A_n$ with $F_{m,n-1}$. Certainly, $o(A_n) = (n - 1)^m$. But, the
number of functions in $F_{m,n}$ that map nothing to $n$ is the same as the number that
map nothing to 1, or the number that map nothing to 2. In other words,
$o(A_i) = (n - 1)^m$, $1 \le i \le n$. Similarly, there is a one-to-one correspondence
between the functions in $A_n \cap A_{n-1}$ and $F_{m,n-2}$. Thus, $o(A_n \cap A_{n-1}) = (n - 2)^m$.
Hence, $o(A_i \cap A_j) = (n - 2)^m$, $1 \le i < j \le n$. Similarly,
$o(A_i \cap A_j \cap A_k) = (n - 3)^m$, $1 \le i < j < k \le n$, and so on. Substituting these values into
Equation (2.21) yields

$$\begin{aligned} n!\,S(m, n) &= n^m - n(n - 1)^m + C(n, 2)(n - 2)^m - C(n, 3)(n - 3)^m + \cdots \\ &= \sum_{s=0}^{n-1} (-1)^{s} C(n, s)(n - s)^m. \end{aligned} \tag{2.22}$$

Because $C(n, n - t) = C(n, t)$, replacing $s$ with $n - t$ in Equation (2.22) yields

$$n!\,S(m, n) = \sum_{t=1}^{n} (-1)^{n-t} C(n, t)\,t^m.$$

It seems we have done nothing more than rediscover Stirling's identity
(Corollary 2.2.4)!
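
Rediscovered or not, the identity is pleasant to test numerically. The Python sketch below assumes $m, n \ge 1$ and compares the alternating sum for $n!\,S(m,n)$ with a direct count of the onto functions; the function names are illustrative, and the codomain is taken to be $\{0, 1, \ldots, n-1\}$ for convenience.

```python
from itertools import product
from math import comb, factorial


def onto_count_pie(m, n):
    """n! S(m, n): the alternating sum of C(n, s) (n - s)^m over 0 <= s < n."""
    return sum((-1) ** s * comb(n, s) * (n - s) ** m for s in range(n))


def onto_count_brute(m, n):
    """Count the functions from an m-set to an n-set that hit every value."""
    return sum(len(set(f)) == n for f in product(range(n), repeat=m))


def stirling2(m, n):
    """Stirling number of the second kind, S(m, n), for m, n >= 1."""
    return onto_count_pie(m, n) // factorial(n)


print(onto_count_pie(5, 3), onto_count_brute(5, 3))  # both are 150
print(stirling2(5, 3))                               # 25
```

The brute-force count examines all $n^m$ functions, so it is only practical for small $m$ and $n$.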

Let's try something else, maybe an example from the intersection of combina- 
torics and number theory. 

2.3.9 Definition. Let $n$ be a positive integer. The Euler totient function $\varphi(n)$ is
the number of positive integers $m \le n$ such that $m$ and $n$ are relatively prime.

2.3.10 Example. The positive integers less than 9 and relatively prime to 9
are 1, 2, 4, 5, 7, and 8, so $\varphi(9) = 6$. The first few values of $\varphi(n)$ appear in
Fig. 2.3.3. □

Figure 2.3.3. The Euler totient function.

2.3.11 Theorem. Suppose $n = p_1^{r_1} p_2^{r_2} \cdots p_k^{r_k}$, where $r_i > 0$, $1 \le i \le k$, and
$p_1, p_2, \ldots, p_k$ are distinct primes. Then

$$\varphi(n) = n \prod_{i=1}^{k} \frac{p_i - 1}{p_i}.$$


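Proof. Let $S = \{1, 2, \ldots, n\}$. Define

$$A_i = \left\{ p_i, 2p_i, 3p_i, \ldots, \left(\frac{n}{p_i}\right) p_i \right\}, \qquad 1 \le i \le k.$$

Then $A_i$ is the subset of $S$ consisting of the multiples of $p_i$. Moreover (just count its
elements), $o(A_i) = n/p_i$. If $i \ne j$, then $A_i \cap A_j$ consists of those elements of $S$ that
are multiples of $p_i$ and $p_j$ and, therefore, of $p_i p_j$. So,

$$A_i \cap A_j = \left\{ p_i p_j, 2p_i p_j, 3p_i p_j, \ldots, \left(\frac{n}{p_i p_j}\right) p_i p_j \right\}.$$

In particular, for $i < j$, $o(A_i \cap A_j) = n/(p_i p_j)$. If $i < j < \ell$, then
$o(A_i \cap A_j \cap A_\ell) = n/(p_i p_j p_\ell)$, and so on.

If $1 \le m \le n$ (i.e., if $m \in S$), then the greatest common divisor of $m$ and $n$ is
greater than 1 if and only if $m$ and $n$ have a common prime divisor if and only if
$m \in A_1 \cup A_2 \cup \cdots \cup A_k$. So,

$$\begin{aligned} \varphi(n) &= n - o(A_1 \cup A_2 \cup \cdots \cup A_k) \\ &= n - \sum_{i} o(A_i) + \sum_{i<j} o(A_i \cap A_j) - \sum_{i<j<\ell} o(A_i \cap A_j \cap A_\ell) + \cdots \\ &= n - \left(\frac{n}{p_1} + \frac{n}{p_2} + \cdots\right) + \left(\frac{n}{p_1 p_2} + \frac{n}{p_1 p_3} + \cdots\right) - \left(\frac{n}{p_1 p_2 p_3} + \cdots\right) + \cdots \\ &= \frac{n}{p_1 p_2 \cdots p_k} \left(E_k - E_{k-1} + E_{k-2} - \cdots + [-1]^k E_0\right), \end{aligned}$$

where $E_t = E_t(p_1, p_2, \ldots, p_k)$ is the $t$th elementary symmetric function, $1 \le t \le k$,
and $E_0 = 1$. Because $(p_1 - 1)(p_2 - 1) \cdots (p_k - 1) = E_k - E_{k-1} + E_{k-2} - \cdots + [-1]^k E_0$,

$$\varphi(n) = \frac{n}{p_1 p_2 \cdots p_k}\,(p_1 - 1)(p_2 - 1) \cdots (p_k - 1).$$

■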
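
As a check on Theorem 2.3.11, the product formula can be compared with a direct count of the integers $m \le n$ that are relatively prime to $n$. The Python sketch below finds the distinct prime divisors by trial division, which is an assumption of convenience rather than anything prescribed by the text; all three function names are illustrative.

```python
from math import gcd


def prime_divisors(n):
    """Distinct prime divisors of n, found by trial division."""
    primes, d = [], 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes


def totient_formula(n):
    """phi(n) = n times the product of (p - 1)/p over the primes p dividing n."""
    result = n
    for p in prime_divisors(n):
        result = result // p * (p - 1)  # exact: p still divides result here
    return result


def totient_count(n):
    """phi(n) straight from the definition."""
    return sum(gcd(m, n) == 1 for m in range(1, n + 1))


print(totient_formula(9), totient_count(9))  # both are 6
```

Dividing by $p$ before multiplying by $p - 1$ keeps every intermediate value an integer.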



2.3.12 Example. A favorite number of the Babylonians was $60 = 2^2 \times 3 \times 5$.
By Theorem 2.3.11,

$$\varphi(60) = 60 \left(\frac{2 - 1}{2}\right) \left(\frac{3 - 1}{3}\right) \left(\frac{5 - 1}{5}\right) = 16.$$

The 16 numbers less than 60 and relatively prime to 60 are 1, 7, 11, 13, 17, 19, 23,
29, 31, 37, 41, 43, 47, 49, 53, and 59. □

2.3. EXERCISES

1 List all 24 elements of S4 and underline the nine derangements. 

2 Tabulate the numbers of permutations in $S_5$ that have exactly $k$ fixed points,
$0 \le k \le 5$. (Hint: Be sure the numbers add up to 5!.)

3 How many elements of $S_6$ have exactly

(a) two fixed points? (b) three fixed points?
(c) four fixed points? (d) five fixed points?

4 If $p \in S_n$, then $p^{-1} \in S_n$ is the unique permutation satisfying
$p(p^{-1}(x)) = x = p^{-1}(p(x))$ for all $x \in \{1, 2, \ldots, n\}$. Prove that $p$ and $p^{-1}$ have the
same number of fixed points.

5 It would seem to follow from Exercise 4 that derangements come in pairs, so
that $D(n)$ should always be even. Find the fallacy in this argument.

6 Show that $D(10) = 1{,}334{,}961$.

7 Compute the number of permutations in $S_{15}$ that have exactly five fixed points
and compare it to the approximation $15!/(5!\,e)$ obtained by multiplying both
sides of Equation (2.20) by $n!$.

8 Use $A \cup B \cup C \cup D = A \cup (B \cup C \cup D)$ along with Equations (2.11) and (2.14)
to give an independent proof of the $n = 4$ case of the Principle of Inclusion and
Exclusion.

9 Among the math courses offered by Sunrise High School are algebra, geometry, 
and trigonometry. To be in the Math Club, a student must have completed (at 
least) one of these three courses. The Math Club has 56 student members, 
altogether. Of these, 28 have taken algebra and 28 have taken geometry, 11 
have taken both algebra and geometry, 12 have taken both algebra and trig, 
and 13 have taken both geometry and trig. If 5 of the students have taken all 
three courses, how many have taken trigonometry? 

(a) Solve the problem using Venn diagrams. 

(b) Solve the problem using the Principle of Inclusion and Exclusion. 

(c) Which is easier? 

10 Use Stirling's identity (circa Equation (2.22)) to

(a) compute $S(12, 3)$.

(b) prove that $2S(m + 1, 3) = 3^m - 2^{m+1} + 1$, $m \ge 0$.

(c) prove that $2S(m + 1, 3)$

(d) prove or disprove that

$$3!\,S(m + 1, 4) = \det \begin{pmatrix} 1^0 & 1^1 & 1^2 & 1^m \\ 2^0 & 2^1 & 2^2 & 2^m \\ 3^0 & 3^1 & 3^2 & 3^m \\ 4^0 & 4^1 & 4^2 & 4^m \end{pmatrix}.$$

11 Show that exactly 24,024 of the 40,320 permutations in $S_8$ derange the even
integers 2, 4, 6, and 8.

12 Prove that $n! = 1 + \sum_{k=2}^{n} C(n, k) D(k)$.

13 Prove that

(a) $D(n) = (n - 1)[D(n - 1) + D(n - 2)]$, $n \ge 3$.

(b) $D(n) = nD(n - 1) + (-1)^n$, $n \ge 2$.

(c) D(n + 1) is even if and only if D(n) is odd. 

14 Starting with D(l), the sequence of derangement numbers is 0, 1, 2, 9, 44, ... . 
Continue the sequence through D(10) using 

(a) the recurrence from Exercise 13(a). 

(b) the recurrence from Exercise 13(b). 

(c) Theorem 2.3.6. 

[Hint: Be mindful of Exercise 13(c).] 

15 How many integer solutions of a + b + c + d = 30 satisfy the boundary 
condition that 

(a) a, b, c, and d are nonnegative? 

(b) a, b, c, and d are not less than 4. 

(c) a,b, and c are nonnegative and d > 11. 

(d) 0<a,b,c,d< 10. 

16 In Exercise 9(b), Section 1.6, one finds that there are 2925 integer solutions to
$a + b + c + d = 30$ that satisfy $3 \le a$, $2 \le b$, $1 \le c$, and $0 \le d$. Use the
Principle of Inclusion and Exclusion to find the number of integer solutions
of $a + b + c + d = 30$ that satisfy $3 \le a \le 5$, $2 \le b \le 6$, $1 \le c \le 7$, and
$0 \le d \le 8$. (Hint: Among the 2925 solutions from Section 1.6, let $A_1$ be the
set consisting of those that satisfy, not $a \ge 3$, but $a \ge 6$; $A_2$ the solutions
that satisfy $b \ge 7$; $A_3$ the solutions that satisfy $c \ge 8$; and $A_4$ the solutions that
satisfy $d \ge 9$.)

17 Find the number of compositions of 12 that have three parts none of which is
larger than 5

(a) by listing them.

(b) using the ideas suggested in Exercise 16. (Show your work!)

(c) by computing the coefficient of $x^{12}$ in $(x + x^2 + x^3 + x^4 + x^5)^3$. (Show
your work!)

18 There are four primes in the range $1 < p < 10$. How many are there in the
range $10 < p < 100$? To find out, let $S = \{n : 10 < n < 100\}$. If $n \in S$ is
composite, then $n$ has a prime divisor less than $\sqrt{100} = 10$. So,
$n \in A_2 \cup A_3 \cup A_5 \cup A_7$, where $A_p = \{n \in S : p \text{ is a factor of } n\}$. Thus, the
number of primes $p$ satisfying $10 < p < 100$ is $o(S) - o(A_2 \cup A_3 \cup A_5 \cup A_7)$.
Use the Principle of Inclusion and Exclusion to evaluate this difference.

19 The positive integer n is "square free" if it is not (exactly) divisible by the 
square of any prime. Show that there are (exactly) 61 square-free positive 
integers less than 100. 

20 Suppose the positive integer divisors of $n$ (all of them, including 1 and $n$) are
$d_1, d_2, \ldots, d_r$. It is shown in Section 4.4 (Exercise 21) that
$n = \varphi(d_1) + \varphi(d_2) + \cdots + \varphi(d_r)$. Confirm this fact when

(a) $n = 6$. (b) $n = 12$.

(c) $n = 15$. (d) $n = 60$.

21 Show that

(a) $\varphi(p) = p - 1$ if $p$ is a prime.

(b) $\varphi(2^n) = 2^{n-1}$.

(c) $\varphi(60) = \varphi(3)\varphi(20)$, $\varphi(60) = \varphi(5)\varphi(12)$, and $\varphi(60) = \varphi(4)\varphi(15)$, but
$\varphi(60) \ne \varphi(6)\varphi(10)$.

(d) $\varphi(mn) = \varphi(m)\varphi(n)$ whenever $m$ and $n$ are relatively prime.

22 Euler proved that $n^{\varphi(m)} - 1$ is an integer multiple of $m$ whenever $m$ and $n$ are
relatively prime. Confirm Euler's theorem when

(a) $n = 4$ and $m = 3$.

(b) $n = 3$ and $m = 4$.

(c) $n = 35$ and $m = 6$.

23 Explain why Euler's theorem (Exercise 22) is a generalization of Fermat's
"little theorem" (Exercise 11, Section 1.7).

24 An inversion of the permutation $p \in S_n$ is an ordered pair $(i, j)$ such that $i < j$
but $p(i) > p(j)$. If $q = (4, 3, 1, 2, 5)$, then $q(1) = 4 > 3 = q(2)$, so $(1, 2)$ is an
inversion of $q$; the other inversions of $q$ are $(1,3)$, $(1,4)$, $(2,3)$, and $(2,4)$, so
$\mathrm{inv}(q) = 5$, where $\mathrm{inv}(p)$ denotes the number of inversions of $p$.

(a) If $p = (i_1, i_2, \ldots, i_n) \in S_n$, define $p^{\#} = (i_n, i_{n-1}, \ldots, i_1)$. Show that
$\mathrm{inv}(p) + \mathrm{inv}(p^{\#}) = C(n, 2)$.

(b) Prove that the average number of inversions over $p \in S_n$ is $\frac{1}{2}C(n, 2)$.

25 Let $r(n, k)$ be the number of permutations in $S_n$ that have exactly $k$ inversions
(see Exercise 24). Prove that

(a) $r(n, k) = r(n, C(n, 2) - k)$.

(b) $r(n + 1, k) = r(n, k) + r(n, k - 1) + \cdots + r(n, k - n)$.

26 Let $F_n$ be the number of permutations $p \in S_n$ that satisfy $|i - p(i)| \le 1$,
$1 \le i \le n$. Prove that $F_n$ is the $n$th Fibonacci number, $n \ge 1$. (The Fibonacci
sequence is defined by $F_0 = F_1 = 1$ and $F_{n+1} = F_n + F_{n-1}$, $n \ge 1$.)

27 Imagine 15 numbered pool balls tumbled, one at a time, onto a pool table in
some random order while a score keeper records an "event" every time the
ordinal number of a ball equals its nominal number (i.e., whenever the $k$th ball
to hit the table happens to be the one decorated with number $k$). The total
(cardinal) number of events is the score, something between 0 and 15,
inclusive. Compute the probability that the score is

(a) 0. (b) 1. (c) 2.

(d) 3. (e) more than 3.

28 Of the 635 billion bridge hands (Example 1.2.5), how many contain at least
one void? (Hint: A void is a missing suit. Let $A_1$ be the set of 13-card bridge
hands that contain no spades, $A_2$ the hands with a heart void, etc.)

29 Write an algorithm/program to generate and list, in dictionary order, all $m!$
permutations in $S_m$.

30 Suppose $n = p_1^{r_1} p_2^{r_2} \cdots p_k^{r_k}$, where $r_i > 0$, $1 \le i \le k$, and $p_1, p_2, \ldots, p_k$ are
distinct primes. Prove that

$$\varphi(n) = \prod_{i=1}^{k} p_i^{r_i - 1}(p_i - 1).$$



2.4. DISJOINT CYCLES 

One picture is worth a thousand words. 

— Fred R. Barnard 

Of the many differences between the kinds of functions one studies in calculus 
(e.g., continuous functions, differentiable functions, etc.) and the kinds we have 
been looking at here (e.g., derangements), one of the most striking concerns 
pictures. There haven't been any pictures of these finite functions. 

The graph of function $f$ is a picture of the set $G(f) = \{(x, f(x)) : x \in D\}$. The
value of a calculus-type graph lies in the qualitative information that it reveals at a
glance. But, consider the permutation $p_1 = (5,2,6,1,4,3,7) \in S_7$. A picture of
$G(p_1) = \{(1,5), (2,2), (3,6), (4,1), (5,4), (6,3), (7,7)\}$ would consist of seven
points scattered in the first quadrant of the $xy$-plane. Such a graph is not without
value. It would, e.g., make it easy to identify the fixed points of $p_1$ lying, as they
do, on the line $y = x$. But, such a graph just does not have the same impact, say, as a
sweeping parabolic illustration of $f(x) = x^2$. On the other hand, why should it?
Calculus-type pictures are crafted to illustrate calculus-type notions. If we are going
to draw pictures, they should be designed to reveal combinatorial notions.

Figure 2.4.1. "X-ray images" of $p_1 = (5,2,6,1,4,3,7)$.

One possibility is the geometric diagram in Fig. 2.4.1a, where the numbers from
the domain/range of $p_1$ are represented by dots and the assignment $p_1(i) = j$ is
illustrated by a directed arc. Reminiscent of an X-ray image, this diagram reveals
some unexpected internal structure. The seven numbers are clearly arranged in four
disjoint cycles, perhaps better illustrated in Fig. 2.4.1b. The lengths of these cycles
are 3, 2, 1, and 1. The cycles of length 1 have a clear interpretation. They represent
the fixed points of $p_1$. The significance of the larger cycles will become more
apparent as we proceed.

Let $p_2 = (6,5,1,3,7,4,2) \in S_7$. As a sequence, $p_2$ looks, qualitatively at least,
just like $p_1$. However (see Fig. 2.4.2), its cycle structure is quite different. (Before
going on, check to be sure that you understand how the picture in Fig. 2.4.2 arises
from the permutation $p_2$.)




Figure 2.4.2. Diagram of $p_2 = (6,5,1,3,7,4,2)$.

Cycle structure turns out to be as important to the theory of permutations as the
fundamental theorem of arithmetic* is to the theory of numbers. When a
permutation is expressed as a sequence, however, this structure is completely
hidden. What's needed is a notation, specific to permutations, that illuminates cycle
structure.

Consider the "4-cycle" of p2 (Fig. 2.4.2), the one that cycles from 1 to 6 to 4 to 3 
and back to 1. One way to represent it is "(1643)", where the numbers 1, 6, 4, and 3 
occur in the same order that they appear in the cycle but with the understanding 
that, when 3 is encountered at the end, the thing to do is cycle back to 1 at the 
beginning. This strategy of cycling back compensates for the fact that while the 
cycle has no beginning and no end, (1643) has both. 

Because each of them represents the same cycle of p2, let's agree to regard 
(1643), (6431), (4316), and (3164) as equivalent. Observe that, while they contain 
the same four integers, (1643) and (1634) are not equivalent. They do not "cycle" 
the numbers in the same order. (See Fig. 2.4.3.) Similarly, the other cycle of p2 can 
be written in any one of the three equivalent ways (257), (572), or (725), but not as 
(275). 

Once we have our hands on the inequivalent cycles, it only remains to put them
together, literally. The disjoint cycle notation for $p_2$ is obtained by juxtaposing its
cycles in either order. So, e.g., we may write $p_2 = (1643)(257)$ or $p_2 = (725)(3164)$.
Notice that there are many different-looking ways to express $p_2$ in this
new notation. How many? Since there are four (equivalent) ways to write the
4-cycle and three ways to write the 3-cycle, and since either cycle can be written first,
there must be $4 \times 3 \times 2 = 24$ ways to express $p_2$ in disjoint cycle notation.

If $q = (16)(45)(237) \in S_7$, then $q(1) = 6$, $q(2) = 3$, $q(3) = 7$, $q(4) = 5$, $q(5) = 4$,
$q(6) = 1$, and $q(7) = 2$. In sequence notation, $q = (6, 3, 7, 5, 4, 1, 2)$. (Disregarding
sequence notation, it would surely be easier to organize this information as
$q(1) = 6$ & $q(6) = 1$; $q(4) = 5$ & $q(5) = 4$; and $q(2) = 3$, $q(3) = 7$, & $q(7) = 2$.)

A formal definition of cycle structure depends on the following technical result. 

2.4.1 Lemma. Let $p \in S_m$ and $x \in \{1, 2, \ldots, m\}$. Consider the sequence defined
recursively by $x_1 = x$ and $x_{n+1} = p(x_n)$, $n \ge 1$. If $k$ is the smallest positive integer
such that $x_{k+1} \in \{x_1, x_2, \ldots, x_k\}$, then $x_{k+1} = p(x_k) = x_1$.

(1643) (1634)
Figure 2.4.3

* Every integer greater than 1 can be factored uniquely as a product of primes.

Proof. If $k = 1$, then $x_1$ is a fixed point of $p$, and the proof is complete. If
$x_{k+1} = x_i$, where $k \ge i > 1$, then $p(x_k) = p(x_{i-1})$ and, because $p$ is one-to-one,
$x_k = x_{i-1}$, contradicting the minimality of $k$. ■

Because $x_{k+1} = x_1$, $x_{k+2} = p(x_{k+1}) = p(x_1) = x_2$. Similarly, $x_{k+3} = x_3$, $x_{k+4} = x_4$,
and so on. The sequence $x_1, x_2, \ldots$ is cyclic with period $k$. The numbers in
the sequence just cycle through $x_1, x_2, \ldots, x_k$ over and over again. If $p = p_2$ and
$x = 7$, then (see Fig. 2.4.2) the sequence is

7, 2, 5, 7, 2, 5, 7, 2, 5, 7, ...

If $p = p_2$ and $x = 4$, the sequence is

4, 3, 1, 6, 4, 3, 1, 6, 4, ...

2.4.2 Definition. Suppose $p, q \in S_m$ and $x, y \in \{1, 2, \ldots, m\}$. Let $x_1 = x$ and
$x_{n+1} = p(x_n)$, $n \ge 1$. If $k$ is the smallest positive integer such that $x_{k+1} = x_1$, then
the cycle of $p$ containing $x$ is

$$C_p(x) = (x_1 x_2 \cdots x_k). \tag{2.23}$$

The length of $C_p(x)$ is $k$, and $C_p(x)$ is sometimes called a $k$-cycle. If $i + 1 = j$, or if
$i = k$ and $j = 1$, the number $x_j$ follows $x_i$ in $C_p(x)$. If $v$ follows $u$ in $C_p(x)$ if and
only if $v$ follows $u$ in $C_q(y)$, $1 \le u, v \le m$, then the cycles $C_p(x)$ and $C_q(y)$ are
equivalent.

Evidently, $C_p(x)$ and $C_q(y)$ are equivalent if and only if they have the same
length and contain the same integers in the same (cyclical) order.

2.4.3 Example. Suppose $p = (16)(24)(357)$ and $q = (124)(357)(6)$. Then
$C_p(3) = (357)$ is equivalent to $C_p(7) = (735)$. While $C_p(7)$ is also equivalent to
$C_q(3) = (357)$, neither is equivalent to $(375)$ nor to $C_q(4) = (412)$. □

Let $p \in S_m$ and $x \in \{1, 2, \ldots, m\}$. Suppose $y \in C_p(x) = (x_1 x_2 \cdots x_k)$, i.e., $y = x_i$
for some $i \le k$. Then

$$C_p(y) = (x_i x_{i+1} \cdots x_k x_1 x_2 \cdots x_{i-1}) \tag{2.24}$$

is equivalent to $C_p(x)$. Indeed, $C_p(x_1)$, $C_p(x_2)$, ..., and $C_p(x_k)$ are all equivalent to
each other and they are the only cycles that are equivalent to $C_p(x)$. Two
conclusions follow from this observation.

2.4.4 Lemma. Suppose $p$ and $q$ are permutations in $S_m$. If $x, y \in \{1, 2, \ldots, m\}$,
then

(a) either $C_p(x)$ and $C_p(y)$ are disjoint or they are equivalent;

(b) either $C_p(x)$ and $C_q(x)$ are identical or they are inequivalent.

Figure 2.4.4. Diagram of $p_3 = (1645237)$.

Figure 2.4.5. Diagram of $p_4 = (1)(2)(3)(4)(5)(6)(7)$.

2.4.5 Definition. Suppose $p \in S_m$. If $C_p(x)$, $C_p(y)$, ..., and $C_p(z)$ are the
inequivalent cycles of $p$, then its disjoint cycle factorization is $p = C_p(x) C_p(y) \cdots C_p(z)$.



2.4.6 Example. The disjoint cycle factorization of

$$\begin{array}{lcl} p_1 = (5,2,6,1,4,3,7) & \text{is} & (154)(2)(36)(7); \\ p_2 = (6,5,1,3,7,4,2) & \text{is} & (1643)(257); \\ p_3 = (6,3,7,5,2,4,1) & \text{is} & (1645237); \\ p_3^{-1} = (7,5,2,6,4,1,3) & \text{is} & (1732546); \\ p_4 = (1,2,3,4,5,6,7) & \text{is} & (1)(2)(3)(4)(5)(6)(7). \end{array}$$

Diagrams illustrating $p_3$ and $p_4$ can be found in Figs. 2.4.4 and 2.4.5, respectively.
A picture for $p_3^{-1}$ can be obtained from the diagram for $p_3$ just by reversing the
direction of each arc. □
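
Passing from sequence notation to disjoint cycle notation is a mechanical process: start with the smallest number not yet used and follow the arcs until the cycle closes. The Python sketch below is one possible rendering (the names `disjoint_cycles` and `cycle_type` are illustrative); it writes each cycle beginning with its smallest element, one of the several equivalent choices.

```python
def disjoint_cycles(seq):
    """Disjoint cycle factorization of a permutation in sequence notation.

    seq[i - 1] is p(i). Each cycle is returned as a tuple that begins with
    the smallest integer not appearing in an earlier cycle.
    """
    seen = set()
    cycles = []
    for start in range(1, len(seq) + 1):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:  # follow x -> p(x) until the cycle closes
            seen.add(x)
            cycle.append(x)
            x = seq[x - 1]
        cycles.append(tuple(cycle))
    return cycles


def cycle_type(seq):
    """Cycle lengths in nonincreasing order, i.e., a partition of len(seq)."""
    return sorted((len(c) for c in disjoint_cycles(seq)), reverse=True)


print(disjoint_cycles((5, 2, 6, 1, 4, 3, 7)))  # [(1, 5, 4), (2,), (3, 6), (7,)]
print(disjoint_cycles((6, 5, 1, 3, 7, 4, 2)))  # [(1, 6, 4, 3), (2, 5, 7)]
```

Reversing every tuple after its first entry gives the cycles of the inverse permutation, in line with the remark about reversing arcs.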

2.4.7 Example. Using disjoint cycle notation, S3 = {(1)(2)(3), (1)(23), 
(12) (3), (123), (132), (13)(2)}. (Compare with Example 2.3.2.) □ 

Apart from equivalence and the order in which the cycles are written, the disjoint
cycle factorization of $p$ is unique. Without loss of generality, we can always choose,
if we wish, to write $C_p(1)$ first, to begin the second cycle (if there is one) with the
smallest integer that does not appear in $C_p(1)$, to begin the third with the smallest
integer that does not appear in either of the first two cycles, etc. This convention
was used in Examples 2.4.6 and 2.4.7. However, we will not use it all the time
because that would unnecessarily complicate future counting arguments. Another
informal convention is to treat equivalent cycles as if they were the same, reflecting
the fact that they represent the same geometric cycle.

Let's take stock of where we are. Illustrating permutations by means of dots and
arcs led to the intuitive notion of a cycle, a simple notion that was, nevertheless,
surprisingly awkward to define formally. Think of that as the price of admission.
Having paid the price, let's amuse ourselves by exploring the cycle structure of
permutations.

Observe that the lengths of the cycles in the disjoint cycle factorization of (any)
$p \in S_m$ comprise the parts of a partition of $m$. In Example 2.4.6, the partition of 7
afforded by $p_1$ is $[3, 2, 1^2]$. The partitions coming from $p_2$, $p_3$, and $p_4$ are $[4, 3]$, $[7]$,
and $[1^7]$, respectively.

2.4.8 Definition. Suppose $p \in S_m$. The partition of $m$ whose parts are the lengths
of the cycles in the disjoint cycle factorization of $p$ is the cycle type of $p$. Two
permutations of the same cycle type are said to have the same cycle structure.

This definition suggests two questions: (1) How many different cycle types are
there? (2) How many different permutations share a specified cycle type? The first
question is easy to answer. The set $S_m$ contains $p(m)$* different cycle types, one for
each partition of $m$. The second question is more interesting.

2.4.9 Example. Consider the permutation $p_2 = (1643)(257) \in S_7$ having cycle
type $[4, 3]$. We have already observed that $(1643)(257)$ is just one of 24
different-looking ways to express $p_2$ in disjoint cycle notation. We now want to consider a
different question, namely, how many different permutations in $S_7$ have cycle type
$[4, 3]$?

Any such permutation can be expressed in the form $p = (abcd)(xyz)$. There are 7
choices for $a$, 6 for $b$, 5 for $c$, and 4 for $d$. While $P(7, 4) = 840$ may give the number
of ways to fill the 4-cycle, it is not the number of ways to choose the 4-cycle. It is
too large. It does not take equivalence into account. Since $(abcd) = (bcda) =
(cdab) = (dabc)$, the number of different 4-cycles that can be produced using seven
numbers is $P(7,4)/4 = 210$. (Don't confuse $P(7,4)/4$ with $P(7,4)/4! = C(7,4)$.)

Once a 4-cycle is chosen, three numbers remain to play the roles of $x$, $y$, and $z$.
These can be arranged in a 3-cycle in $P(3, 3)/3 = 2$ inequivalent ways, e.g., $(xyz)$
or $(xzy)$. By the fundamental counting principle, $S_7$ must contain $210 \times 2 = 420$
permutations of cycle type $[4, 3]$. □

Note that the 420 permutations enumerated in Example 2.4.9 have the same
cycle structure as $(abc)(wxyz)$. Indeed,

$$\frac{7 \times 6 \times 5 \times 4}{4} \times \frac{3 \times 2 \times 1}{3} = \frac{7 \times 6 \times 5}{3} \times \frac{4 \times 3 \times 2 \times 1}{4}.$$



2.4.10 Example. How many permutations in $S_{12}$ have cycle type
$[3^2, 2^3] = [3, 3, 2, 2, 2]$? Solution: The generic permutation of this type is

$$p = (abc)(xyz)(ij)(uv)(rs).$$

* Lowercase "p" has been used (so far!) for primes, probabilities, polynomials, partitions, and
permutations. Here $p(m)$ is the number of partitions of $m$. (Is it any wonder that mathematicians frequently
resort to other alphabets?)

There are $P(12, 3)/3 = 440$ ways to choose the first 3-cycle. Once it is chosen,
there are $P(9, 3)/3 = 168$ ways to choose the second. So, $440 \times 168 = 73{,}920$
is the number of ways to choose an ordered sequence of two 3-cycles. Because
the cycles of a permutation can be arranged in any order, this number double counts
the pair of 3-cycles, once in the form $(abc)(xyz)$ and again as $(xyz)(abc)$.
Compensating for this double counting, we see that there are $73{,}920/2 = 36{,}960$ different
ways to choose the pair of 3-cycles.

This issue of double counting did not arise in Example 2.4.9 because cycle type
$[4, 3]$ does not admit two (or more!) cycles of the same length. (Check to see, e.g.,
that $(1643)(257)$ and $(257)(1643)$ did not get counted as different permutations in
Example 2.4.9.)

No matter which six numbers occur in the two 3-cycles, six numbers remain to be
distributed among the three 2-cycles: $(ij)$ can be chosen in $P(6, 2)/2 = 15$ ways*;
$(uv)$ in $P(4, 2)/2 = 6$ ways; and $(rs)$ in $P(2, 2)/2 = 1$ way. So, six numbers will
produce $15 \times 6 = 90$ ordered sequences of three 2-cycles. Because an unordered
collection of three 2-cycles can be arranged in $3! = 6$ ways, the same six numbers
will produce $90/6 = 15$ unordered collections of three 2-cycles. Hence, the number of
permutations in $S_{12}$ of cycle type $[3^2, 2^3]$ is $36{,}960 \times 15 = 554{,}400$.

Each of the $12! = 479{,}001{,}600$ permutations in $S_{12}$ has one of $p(12) = 77$ cycle
types. Our calculations show that a little over 0.11% of the permutations in $S_{12}$ have
cycle type $[3^2, 2^3]$. □
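
The bookkeeping in Examples 2.4.9 and 2.4.10 follows a fixed pattern: fill each $k$-cycle in $P(\text{remaining}, k)/k$ ways, then divide by $a!$ for every group of $a$ cycles of the same length. A minimal Python sketch of that pattern, with the hypothetical name `count_cycle_type`, might look like this.

```python
from collections import Counter
from math import factorial, perm


def count_cycle_type(parts):
    """Number of permutations whose cycle lengths are exactly `parts`.

    `parts` lists the cycle lengths, e.g. [3, 3, 2, 2, 2]; the permutations
    counted act on sum(parts) numbers.
    """
    remaining = sum(parts)
    ordered = 1
    for k in parts:
        # P(remaining, k)/k inequivalent k-cycles on the unused numbers.
        ordered *= perm(remaining, k) // k
        remaining -= k
    # Cycles of equal length were chosen in order; undo that overcount.
    for multiplicity in Counter(parts).values():
        ordered //= factorial(multiplicity)
    return ordered


print(count_cycle_type([4, 3]))           # 420
print(count_cycle_type([3, 3, 2, 2, 2]))  # 554400
```

The order in which the parts are listed does not change the result, which mirrors the observation that $(abcd)(xyz)$ and $(abc)(wxyz)$ describe the same cycle structure.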

2.4.11 Example. Of the permutations in $S_7$, how many have disjoint cycle
factorizations consisting of exactly three cycles? Solution: The $p_3(7) = 4$ three-part
partitions of 7 are $[5, 1^2]$, $[4, 2, 1]$, $[3^2, 1]$, and $[3, 2^2]$. Given the tools presently
at our disposal, answering the question evidently requires four computations of
the type just completed in Example 2.4.10. That's the bad news. The good news
is that, while $S_{12}$ contains nearly 500 million permutations, $S_7$ contains only
$7! = 5040$.

Let's start with cycle type $[5, 1^2]$, corresponding to a permutation of the form
$(abcde)(x)(y)$. There are $P(7,5)/5 = 504$ ways to choose the 5-cycle. Once that
is done, there is only one way to fix the remaining two points. So, exactly 10%
of the permutations in $S_7$ have cycle type $[5, 1^2]$.

Alternatively, because there are 7 ways to choose the fixed point $x$, and then 6
ways to choose $y$, there are $7 \times 6$ ways to choose an ordered pair of 1-cycles.
Adjusting for the fact that this counts $(x)(y)$ as different from $(y)(x)$ yields
$[7 \times 6]/2 = C(7,2) = 21$ ways to choose an unordered pair of 1-cycles. After
the 1-cycles have been chosen, there are $P(5, 5)/5 = 5!/5 = 24$ ways to choose
the 5-cycle. This alternative computation leads, of course, to the same answer, i.e.,
$21 \times 24 = 504$ of the permutations in $S_7$ have cycle type $[5, 1^2]$.

* Because $(ij)$ and $(ji)$ are equivalent, one could just as well argue that there are $C(6, 2)$ ways to choose this
2-cycle. Of course, $C(6,2) = \frac{1}{2}P(6,2)$.

Among the various ways to count the permutations of cycle type $[4, 2, 1]$, having
the generic form $(abcd)(ij)(z)$, are

[P(7,4)/4] x [P(3,2)/2] = 210 x 3 = 630, 
[P(7,2)/2] x [P(5,4)/4] = 21 x 30 = 630, 

and/or 

7 x [P(6,4)/4] = 7 x 90 = 630. 

In the first and second alternatives, after the 4-cycle and 2-cycle have been chosen,
there is only one way to choose the 1-cycle. In the third case, after the fixed point
and the 4-cycle are chosen, there is just one way to choose the 2-cycle (because $(xy)$
and $(yx)$ are equivalent).

Next, consider the generic permutation $(abc)(xyz)(w)$ of cycle type $[3^2, 1]$. Once
the 3-cycles have been chosen, there is just one choice for $w$. So, because an
unordered pair of 3-cycles can be chosen in

$$([P(7,3)/3] \times [P(4,3)/3])/2 = (70 \times 8)/2$$

ways, there are 280 permutations in $S_7$ of cycle type $[3^2, 1]$.

The fourth three-part partition of 7 is $[3, 2^2]$. There are $P(7, 3)/3 = 70$ ways to
choose the 3-cycle. Once it has been chosen, there are
$([P(4,2)/2] \times [P(2,2)/2])/2 = 6/2 = 3$ ways to choose an unordered pair of 2-cycles from the remaining
four numbers. So, the number of permutations having this cycle type is $70 \times 3 = 210$.
Alternatively, we could just as well choose the unordered pair of 2-cycles in

([P(7,2)/2]x[P(5,2)/2])/2=(21xl0)/2 

= 105 

ways, and a 3-cycle from the remaining three numbers in P(3, 3)/3 = 2 ways, for a 
total of 105 x 2 = 210. 

Adding the numbers of each cycle type produces a total of 

504 + 630 + 280 + 210 = 1624 (2.25) 

permutations in S 7 having disjoint cycle factorizations consisting of exactly three 
cycles. □ 
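The counts in this example are small enough to confirm by brute force. The following Python sketch is an illustration only, not part of the original example: it lists all 7! = 5040 permutations, computes each cycle type, and tallies the types that have exactly three cycles. The function names are hypothetical, and the points are labeled 0 through 6 rather than 1 through 7, which does not affect cycle types.

```python
from collections import Counter
from itertools import permutations


def cycle_type(p):
    """Return the cycle type of p as a tuple of cycle lengths, largest first.

    p is a permutation of 0..m-1 in sequence notation: p[i] is the image of i.
    """
    seen = [False] * len(p)
    lengths = []
    for start in range(len(p)):
        if seen[start]:
            continue
        length, x = 0, start
        while not seen[x]:          # walk around the cycle containing start
            seen[x] = True
            x = p[x]
            length += 1
        lengths.append(length)
    return tuple(sorted(lengths, reverse=True))


def tally_cycle_types(m, n_cycles):
    """Count the permutations of an m-element set by cycle type,
    keeping only the cycle types with exactly n_cycles parts."""
    counts = Counter(cycle_type(p) for p in permutations(range(m)))
    return {t: c for t, c in counts.items() if len(t) == n_cycles}


three_cycle_types = tally_cycle_types(7, 3)
for t, c in sorted(three_cycle_types.items(), reverse=True):
    print(t, c)
print(sum(three_cycle_types.values()))
```

The four tallies should agree with the four counts found above, and their sum with Equation (2.25).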

Example 2.4.11 involved a lot of work. If it is going to be important to know how
many permutations in S_m have disjoint cycle factorizations consisting of (exactly) n
cycles, i.e., if we face the prospect of doing many problems like the one in
Example 2.4.11, it would surely be worth the effort to look for an easier way of
going about it. Such efforts often begin by giving the desired quantity a name.

2.4.12 Definition. The number of permutations in S_m whose disjoint cycle
factorizations consist of (exactly) n cycles is the Stirling number of the first kind,
s(m, n).

From Equation (2.25), s(7,3) = 1624. From Example 2.4.7, s(3,1) = 2,
s(3,2) = 3, and s(3,3) = 1. Stirling numbers of the first kind, s(m,n), and their
relationship to Stirling numbers of the second kind, S(m,n), are the subject of
the next section. (Be aware that s and S are easily confused, especially when
they are not printed side by side.)



2.4. EXERCISES 

1 Draw the "X-ray image" and then write down the disjoint cycle factorization 
of 

(a) p = (2,6,4,5,3,1,7). (b) p = (7,6,5,4,3,2, 1). 

(c) p= (7,6,4,5,3,2,1). (d) p= (4,7,6, 1,3,2,5). 

(e) p= (6,1,4,9,8,2,5,7,3). (f) p = (3,4,5,2,7,8,9,6, 1). 

2 By the reasoning of Example 2.4.10, exactly 15 permutations in S_6 have cycle
type [2^3]. Write them all down (using disjoint cycle notation).

3 Convert from disjoint cycle to sequence notation: 
(a) (123) (45) (67). (b) (135) (246). 

(c) (13) (5) (246). (d) (12) (3) (4) (5). 
(e) (1) (2) (345). (f) (15432). 

4 If p ∈ S_m, then p^{-1} ∈ S_m is the unique permutation that satisfies p(p^{-1}(x)) =
x = p^{-1}(p(x)), 1 ≤ x ≤ m. Find the disjoint cycle factorization of p^{-1} when

(a) p = (1234). (b) p = (12345).

(c) p = (123456). (d) p = (15432).

(e) p = (15)(23)(4). (f) p = (184)(2756)(3).

(g) p = (1357)(8642). (h) p = (1742)(3586).

5 Suppose (x_1 x_2 ··· x_{k-1} x_k) is a cycle in the disjoint cycle factorization of
permutation p. Show that (x_k x_{k-1} ··· x_2 x_1) is a cycle in the disjoint cycle
factorization of p^{-1} (treating equivalent cycles as if they were equal).

6 Suppose p ∈ S_m. Prove that p and p^{-1} have the same cycle type. (Hint:
Exercise 5.)

7 Express all 24 permutations of S_4 in disjoint cycle notation.

8 Write down the seven cycle types that occur among the permutations of S_5.
(Hint: Example 1.8.6.)

9 Compute the number of permutations in S_5 of each cycle type.

10 Compute the Stirling numbers of the first kind, s(5,n), 1 ≤ n ≤ 5. (Hint:
Exercise 9.)

11 Show that the number of permutations in S_12 of cycle type
(a) [3^4] is 246,400. (b) [4^3] is 1,247,400.

(c) [6^2] is 6,652,800. (d) [2^6] is 10,395.



12 How many permutations in S_12 have cycle type
(a) [4, 2^4]? (b) [4^2, 2, 1^2]?
(c) [1^12]? (d) [12]?

13 A transposition is a permutation of cycle type [2, 1^{m-2}]. How many
permutations in S_m are transpositions?

14 Show that there are P(m,k)/k permutations in S_m of cycle type [k, 1^{m-k}].

15 Compute s(7,2). (Hint: Begin with p_2(7).)

16 Exhibit the cycle types of the derangements in S m for 
(a) m = 4. (b) m = 5. (c) m = 6. 

17 Prove that the total number of different cycle types afforded by derangements
in S_m is

\[ \sum_{n=1}^{\lfloor m/2 \rfloor} p_n(m-n), \]

where ⌊m/2⌋ is the greatest integer not larger than m/2.

18 Recall that C_p(1) is the cycle of the permutation p that contains the number 1.
If k is a fixed but arbitrary integer satisfying 1 ≤ k ≤ m, prove that the length
of C_p(1) is k in exactly (m - 1)! of the permutations of S_m.

19 Denote by c_t(p) (not to be confused with C_p(t)) the number of cycles of length
t in the disjoint cycle factorization of p ∈ S_m.

(a) Prove that c_1(p) + 2c_2(p) + 3c_3(p) + ··· + mc_m(p) = m.

(b) Let (k_1, k_2, ..., k_m) be a sequence of nonnegative integers that satisfies
k_1 + 2k_2 + 3k_3 + ··· + mk_m = m. Prove that the number of permutations
p ∈ S_m that satisfy c_t(p) = k_t, 1 ≤ t ≤ m, is m!/K, where K =
1^{k_1} k_1! 2^{k_2} k_2! 3^{k_3} k_3! ··· m^{k_m} k_m!.



2.5. STIRLING NUMBERS OF THE FIRST KIND 

The wind had dropped, and the snow, tired of rushing round in circles trying to catch
itself up, now fluttered gently down until it found a place on which to rest, and
sometimes that place was Pooh's nose and sometimes it wasn't.

— A. A. Milne

How many permutations p ∈ S_m have disjoint cycle factorizations that consist of a
single cycle? The name of the answer is s(m, 1), a Stirling number of the first kind.
In this case, the number itself is easy enough to compute. In the generic m-cycle,
p = (x_1 x_2 ··· x_m), there are m choices for x_1, m - 1 choices for x_2, ..., and 1 choice
for x_m. So, there are m! ways to fill the m-cycle. Taking equivalence into account,
we obtain

s(m, 1) = m!/m = (m - 1)!. (2.26)

Strictly speaking, (m - 1)! is just another name for the answer. However, associated
with this name is an algorithm for producing an actual number. It would be useful to
have simple algorithms for producing the remaining Stirling numbers, s(m,n),
2 ≤ n ≤ m.

The only partition of m having m parts is [1^m], and the only permutation in S_m of
this cycle type is the one having m fixed points. So,

s(m, m) = 1. (2.27)

From Equation (2.26), s(3,1) = (3 - 1)! = 2 and, from Equation (2.27),
s(3,3) = 1. Because

s(3,1) + s(3,2) + s(3,3) = o(S_3) = 6,

s(3,2) = 3, confirming the values found in the last section using Example 2.4.7.

From Equation (2.25), s(7,3) = 1624, and s(7,2) = 1764 is the answer to
Exercise 15, Section 2.4. So, we are well on our way to filling in the entries of
Fig. 2.5.1, a second Stirling triangle, one comprised of Stirling numbers of the first
kind.

Sometimes called a "signless" Stirling number of the first kind.

Having been in similar places before, it is easy to anticipate that the next result
will be a recurrence for Stirling numbers of the first kind.

2.5.1 Theorem. If 1 < n ≤ m, then

s(m + 1, n) = s(m, n - 1) + ms(m, n).

Compare and contrast Theorem 2.5.1 with S(m + 1, n) = S(m, n - 1) + nS(m, n),
the recurrence for Stirling numbers of the second kind (Theorem 2.1.20).

Proof of Theorem 2.5.1. Let K be the set of permutations in S_{m+1} whose disjoint
cycle factorizations consist of n cycles. By definition, o(K) = s(m + 1, n). The
theorem is proved by showing that o(K_1) = s(m, n - 1) and o(K_2) = ms(m, n),
where K_1 = {p ∈ K : p(m + 1) = m + 1} and K_2 = K \ K_1 = {p ∈ K : p(m + 1) ≠
m + 1}.

Observe that p ∈ K_1 if and only if m + 1 is a fixed point of p if and only if
(m + 1) is a cycle of p. Deleting this 1-cycle from p leaves a permutation
p' ∈ S_m. Since p is the unique permutation in S_{m+1} that can be obtained by
juxtaposing p' and the 1-cycle (m + 1), p ↔ p' is a one-to-one correspondence between
K_1 and the permutations in S_m whose disjoint cycle factorizations consist of n - 1
cycles, i.e., o(K_1) = s(m, n - 1).

The evaluation of o(K_2) is similar, except that the correspondence is m-to-one.
If p ∈ K_2, then m + 1 must lie in a cycle of p of length greater than 1, i.e.,
p(m + 1) = i for some i ≠ m + 1. Deleting m + 1 from this cycle produces p^#, a
permutation in S_m with n cycles in its disjoint cycle factorization. Conversely, for
any such f ∈ S_m, there is a p ∈ K_2 ⊂ S_{m+1} such that p^# = f: Simply insert m + 1
just before i in the disjoint cycle factorization of f. Because there are m possible
choices for i, there must be (exactly) m permutations p ∈ K_2 such that p^# = f,
i.e., o(K_2) = ms(m, n). ■

Theorem 2.5.1 makes it easy to fill in the entries of Fig. 2.5.1, a row at a time,
e.g.,

s(4,2) = s(3,1) + 3s(3,2) = 2 + 3 × 3 = 11,
s(4,3) = s(3,2) + 3s(3,3) = 3 + 3 × 1 = 6,

and so on, resulting eventually in Fig. 2.5.2.
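The row-at-a-time procedure is easy to mechanize. The Python sketch below is an illustration under stated assumptions rather than the text's own algorithm: the function name is hypothetical, and the conventions s(m, 0) = 0 and s(m, m + 1) = 0 are introduced here so that a single formula also produces the boundary values of Equations (2.26) and (2.27), which lie outside the range 1 < n ≤ m covered by Theorem 2.5.1.

```python
def stirling_first_triangle(max_m):
    """Build rows 1..max_m of Stirling numbers of the first kind.

    rows[m][n] holds s(m, n) for 1 <= n <= m; rows[m][0] is padding equal to 0.
    """
    rows = {1: [0, 1]}                      # s(1, 1) = 1
    for m in range(1, max_m):
        prev = rows[m] + [0]                # assumed convention: s(m, m + 1) = 0
        # s(m + 1, n) = s(m, n - 1) + m * s(m, n); prev[0] = 0 plays the role of s(m, 0)
        rows[m + 1] = [0] + [prev[n - 1] + m * prev[n] for n in range(1, m + 2)]
    return rows


triangle = stirling_first_triangle(7)
for m in range(1, 8):
    print(m, triangle[m][1:])
```

Row 4 of the output should reproduce s(4,2) = 11 and s(4,3) = 6 from the computation above, and row 7 should contain s(7,2) = 1764 and s(7,3) = 1624.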

Stirling numbers of the first kind pop up in a variety of places that are not
obviously related to the cycle structure of permutations. Recall, e.g., our discussion
of formulas for the mth-power sum, 1^m + 2^m + ··· + n^m, m ≥ 0. We are now able to
address the case in which m = -1.

Figure 2.5.2. Stirling numbers of the first kind, s(m,n).



2.5.2 Theorem. If n is a positive integer, then the harmonic number

\[ \sum_{k=1}^{n} \frac{1}{k} = \frac{s(n+1,2)}{n!}. \tag{2.28} \]

Proof. The proof is by induction on n. When n = 1, Equation (2.28) becomes
1 = s(2,2)/1!, and we are off to a good start. Using the induction hypothesis,

\begin{align*}
\sum_{k=1}^{n+1} \frac{1}{k} &= \sum_{k=1}^{n} \frac{1}{k} + \frac{1}{n+1} \\
&= \frac{s(n+1,2)}{n!} + \frac{1}{n+1} \\
&= \frac{(n+1)s(n+1,2) + n!}{(n+1)!} \\
&= \frac{s(n+1,1) + (n+1)s(n+1,2)}{(n+1)!} \\
&= \frac{s(n+2,2)}{(n+1)!},
\end{align*}

by Equation (2.26) and Theorem 2.5.1. ■

2.5.3 Example. From Fig. 2.5.2, s(6,2) = 274. By Equation (2.28), s(6,2) is
equal to

\begin{align*}
5!\left(\frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5}\right) &= 120 + \frac{120}{2} + \frac{120}{3} + \frac{120}{4} + \frac{120}{5} \\
&= 120 + 60 + 40 + 30 + 24 \\
&= 274.
\end{align*}
□

From Equations (2.26) and (2.28),

\[ \frac{s(n,2)}{s(n,1)} = \sum_{k=1}^{n-1} \frac{1}{k}. \]

Because the harmonic series diverges, taking limits of both sides yields

\[ \lim_{n \to \infty} \frac{s(n,2)}{s(n,1)} = \infty. \]

Hence, despite the rapid growth of s(n,1) = (n - 1)!, the ratio s(n,2)/s(n,1) is
still unbounded.
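Equation (2.28) and the slow growth of the ratio s(n,2)/s(n,1) can both be checked with exact rational arithmetic. The following Python sketch is offered as an illustration only; the helper names are hypothetical. It computes s(m,2) from Theorem 2.5.1 with n = 2 and compares the result with the harmonic number. The first step of the loop (from s(1,2) = 0 to s(2,2) = 1) lies outside the theorem's stated range but agrees with Equation (2.27).

```python
from fractions import Fraction
from math import factorial


def s_m_2(m):
    """s(m, 2) via s(k + 1, 2) = s(k, 1) + k * s(k, 2), with s(k, 1) = (k - 1)!."""
    value = 0                                  # s(1, 2) = 0: one point cannot form two cycles
    for k in range(1, m):
        value = factorial(k - 1) + k * value   # value becomes s(k + 1, 2)
    return value


def harmonic(n):
    """Exact value of 1 + 1/2 + ... + 1/n as a Fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


# Equation (2.28): the nth harmonic number equals s(n + 1, 2) / n!
for n in range(1, 21):
    assert harmonic(n) == Fraction(s_m_2(n + 1), factorial(n))

print(s_m_2(6))                                # the value discussed in Example 2.5.3
for n in (10, 100, 1000):
    print(n, float(harmonic(n - 1)))           # the ratio s(n, 2) / s(n, 1)
```

The printed ratios grow without bound, but only about as fast as the natural logarithm of n.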

We now have three ways to evaluate, say, s(21,2). The original brute-force
approach is to count the permutations in S_21 whose disjoint cycle factorizations
consist of two cycles. As illustrated in Example 2.4.11, this might be done by
summing the numbers of permutations having cycle types [20, 1], [19, 2], [18, 3], ...,
[11, 10]. A second option is to use Equation (2.28) and compute

\[ s(21,2) = 20!\left(1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{20}\right). \]

A third is to use Theorem 2.5.1, a method that requires us first to compute s(20,2)
(not to mention s(20,1) = 19!). None of these methods seems particularly easy.
Let's try another approach. Define

\begin{align*}
g_m(x) &= \sum_{n=1}^{m} s(m,n)x^n \\
&= s(m,1)x + s(m,2)x^2 + \cdots + s(m,m)x^m. \tag{2.29}
\end{align*}

Superficially, the generating function g_m(x) is just a fancy way to display the
numbers s(m,n), 1 ≤ n ≤ m. On the other hand, this perspective hints at the
possibility of using facts about polynomials to shed some light on the coefficients of
g_m(x). Let's have a look at the first few of these polynomials. From Fig. 2.5.2,

g_1(x) = x,
g_2(x) = x + x^2 = x(1 + x) = x(x + 1),
g_3(x) = 2x + 3x^2 + x^3 = x(2 + 3x + x^2) = x(x + 1)(x + 2).

Already, a pattern seems to be emerging. Observe that

x(x + 1)(x + 2)(x + 3) = g_3(x)(x + 3)
= (2x + 3x^2 + x^3)(3 + x)
= 6x + 11x^2 + 6x^3 + x^4,

which, by Fig. 2.5.2, is precisely g_4(x).

2.5.4 Theorem. If g_m(x) = \sum_{n=1}^{m} s(m,n)x^n, then, for all m ≥ 1,

g_m(x) = x(x + 1)(x + 2) ··· (x + m - 1). (2.30)

Proof. The examples preceding the statement of Theorem 2.5.4 suffice to start an
induction on m. From

g_m(x) = s(m,1)x + s(m,2)x^2 + ··· + s(m,m)x^m,

we obtain

xg_m(x) = s(m,1)x^2 + ··· + s(m,n - 1)x^n + ··· + s(m,m - 1)x^m + s(m,m)x^{m+1},
mg_m(x) = ms(m,1)x + ··· + ms(m,n)x^n + ··· + ms(m,m)x^m.

Adding these two equations produces the identity

(x + m)g_m(x) = ms(m,1)x + ··· + [s(m,n - 1) + ms(m,n)]x^n + ···
+ [s(m,m - 1) + ms(m,m)]x^m + s(m,m)x^{m+1}.

From Equation (2.26), ms(m,1) = m(m - 1)! = m! = s(m + 1, 1); from Theorem
2.5.1, s(m,n - 1) + ms(m,n) = s(m + 1, n), 2 ≤ n ≤ m; and from Equation
(2.27), s(m,m) = 1 = s(m + 1, m + 1). Hence, this last identity is equivalent to
(x + m)g_m(x) = g_{m+1}(x). ■
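Theorem 2.5.4 can be tested independently of Theorem 2.5.1 by expanding the product in Equation (2.30) and comparing its coefficients with a direct count of permutations by number of cycles. The Python sketch below is an illustration; its function names are hypothetical, and it labels points 0 through m - 1.

```python
from itertools import permutations


def rising_factorial_coeffs(m):
    """Coefficients of x(x+1)...(x+m-1); entry n is the coefficient of x**n."""
    coeffs = [0, 1]                               # g_1(x) = x
    for k in range(1, m):                         # multiply by (x + k)
        times_x = [0] + coeffs                    # x * g_k(x)
        times_k = [k * c for c in coeffs] + [0]   # k * g_k(x)
        coeffs = [a + b for a, b in zip(times_x, times_k)]
    return coeffs


def cycle_count(p):
    """Number of cycles (fixed points included) of p, where p[i] is the image of i."""
    seen, cycles = set(), 0
    for start in range(len(p)):
        if start not in seen:
            cycles += 1
            x = start
            while x not in seen:
                seen.add(x)
                x = p[x]
    return cycles


def stirling_row_by_enumeration(m):
    """[0, s(m,1), ..., s(m,m)] obtained by listing all m! permutations."""
    row = [0] * (m + 1)
    for p in permutations(range(m)):
        row[cycle_count(p)] += 1
    return row


for m in range(1, 7):
    assert rising_factorial_coeffs(m) == stirling_row_by_enumeration(m)
print(rising_factorial_coeffs(4))   # coefficients of g_4(x), constant term first
```

The enumeration is limited to m ≤ 6 only to keep the running time negligible; the polynomial expansion itself costs about m^2 integer operations.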

2.5.5 Corollary. Stirling numbers of the first kind are given in terms of
elementary symmetric functions by means of the identity

s(m, n) = E_{m-n}(1, 2, ..., m - 1), m ≥ n ≥ 1. (2.31)

Proof. Recall that

(x - a_1)(x - a_2) ··· (x - a_m) = x^m - E_1 x^{m-1} + E_2 x^{m-2} - ··· + (-1)^m E_m, (2.32)

where E_r = E_r(a_1, a_2, ..., a_m) is the rth elementary symmetric function.
Substituting a_i = 1 - i, 1 ≤ i ≤ m, in Equation (2.32) yields

\[ x(x+1)(x+2)\cdots(x+m-1) = \sum_{r=0}^{m} (-1)^r E_r(0,-1,-2,\ldots,1-m)\,x^{m-r}. \]

Together with Theorem 2.5.4 and the fact that

(-1)^r E_r(0, -1, -2, ..., 1 - m) = (-1)^r E_r(-1, -2, ..., 1 - m)
= E_r(1, 2, ..., m - 1),

we see from this identity that

g_m(x) = s(m,m)x^m + s(m,m - 1)x^{m-1} + ··· + s(m,1)x
= E_0 x^m + E_1 x^{m-1} + ··· + E_{m-1} x + E_m,

where, this time, E_r = E_r(1, 2, ..., m - 1). To complete the proof, it remains to
compare s(m,n), the coefficient of x^n in the first of these expressions, with
E_{m-n}(1, 2, ..., m - 1), the coefficient of x^n in the second (and to observe that
E_m(1, 2, ..., m - 1) = 0). ■

The elementary numbers e(n,r) = E_r(1, 2, ..., n) appeared in Section 1.9. By
Corollary 2.5.5, s(m,n) = e(m - 1, m - n). (Confirm this identity by comparing
Fig. 2.5.2 with Fig. 1.9.2.)
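Corollary 2.5.5 translates directly into a computation: sum the products of all (m - n)-element subsets of {1, 2, ..., m - 1}. The following Python sketch is an illustration with hypothetical function names; it relies on `itertools.combinations` and `math.prod` (Python 3.8 or later), and on the fact that the empty product is 1, so that E_0 = 1.

```python
from itertools import combinations
from math import prod


def elementary_symmetric(r, values):
    """E_r(values): the sum of the products of all r-element subsets of values."""
    return sum(prod(subset) for subset in combinations(values, r))


def stirling_first_via_symmetric(m, n):
    """s(m, n) computed as E_{m-n}(1, 2, ..., m - 1), following Equation (2.31)."""
    return elementary_symmetric(m - n, range(1, m))


row5 = [stirling_first_via_symmetric(5, n) for n in range(1, 6)]
print(row5)
print(elementary_symmetric(5, range(1, 5)))   # E_m(1, ..., m - 1) with m = 5
```

The second printed value illustrates the closing remark of the proof: there are no 5-element subsets of a 4-element set, so the sum is empty. This approach is convenient for checking small cases, but the number of subsets grows quickly, so the recurrence of Theorem 2.5.1 remains the better tool for building the whole triangle.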

2.5.6 Example. Because 1 × 2 × ··· × (k - 1)(k + 1) × ··· × m = m!/k, it
follows from Corollary 2.5.5 that

\begin{align*}
s(m+1,2) &= E_{m-1}(1,2,\ldots,m) \\
&= \sum_{k=1}^{m} \frac{m!}{k} \\
&= m! \sum_{k=1}^{m} \frac{1}{k},
\end{align*}

giving another proof of Theorem 2.5.2. □

Let

f_m(x) = x^m - s(m,m - 1)x^{m-1} + s(m,m - 2)x^{m-2} - ··· + (-1)^{m-1} s(m,1)x. (2.33)

Then f_m(x) can be obtained from g_m(x) by alternating the signs of its coefficients.
Hence, from Equations (2.29) and (2.30) (or Equation (2.32)),

f_m(x) = x(x - 1)(x - 2) ··· (x - m + 1), (2.34)

the falling factorial function. As will be seen in Theorem 2.5.8 (below), this
observation has some surprising consequences.

2.5.7 Example. Consider the 5 × 5 matrix F_5 whose (i,j)-entry is the Stirling
number of the first kind, s(i,j), 1 ≤ i, j ≤ 5, where s(i,j) = 0 if i < j. From
Fig. 2.5.2,

\[ F_5 = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 \\ 2 & 3 & 1 & 0 & 0 \\ 6 & 11 & 6 & 1 & 0 \\ 24 & 50 & 35 & 10 & 1 \end{pmatrix}. \]

This is another example of a matrix that is clearly invertible. (Its determinant is 1.)
The last time we found ourselves in such a situation we were looking at the Pascal
matrix C_n = (C(i,j)). In that context, C_n^{-1} was found by sprinkling minus signs
among the entries of C_n. Might the same trick work again? Could

\[ Y = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ -1 & 1 & 0 & 0 & 0 \\ 2 & -3 & 1 & 0 & 0 \\ -6 & 11 & -6 & 1 & 0 \\ 24 & -50 & 35 & -10 & 1 \end{pmatrix} \]

be the inverse of F_5? Check it out. Before reading on, convince yourself that
F_5Y ≠ I_5, the 5 × 5 identity matrix.

Okay, inverting F_5 is not as easy as alternating minus signs among its entries.
Matrix Y is not the inverse of F_5; it is the inverse of T_5 = (S(i,j)), the 5 × 5 matrix
whose (i,j)-entry is a Stirling number of the second kind! From Fig. 2.1.2,

\[ T_5 Y = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 \\ 1 & 3 & 1 & 0 & 0 \\ 1 & 7 & 6 & 1 & 0 \\ 1 & 15 & 25 & 10 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ -1 & 1 & 0 & 0 & 0 \\ 2 & -3 & 1 & 0 & 0 \\ -6 & 11 & -6 & 1 & 0 \\ 24 & -50 & 35 & -10 & 1 \end{pmatrix} = I_5. \tag{2.35} \]



Recall that elementary row operations can be achieved via multiplication on the
left by an elementary matrix. Thus, e.g., the effect of premultiplying an n × n
matrix A = (a_ij) by the diagonal matrix diag(-1, 1, 1, 1, ..., 1) is to change the
sign of every entry in its first row. The result of premultiplying A by the
diagonal matrix

D_n = diag(-1, 1, -1, 1, -1, ..., (-1)^n),

in which the n diagonal entries alternate between -1 and +1, is to change the signs
of the entries of A that lie in odd-numbered rows. Similarly, the (i,j)-entry of AD_n is

a_ij if j is even,
-a_ij if j is odd.

Pre- and postmultiplying A by D_n sprinkles a checkerboard pattern of alternating
minus signs among its entries, which is precisely the way Y = T_5^{-1} is obtained from F_5,
i.e., Y = D_5F_5D_5. Moreover, because D_n^2 = I_n, D_n is its own inverse. Thus,
D_nF_nD_n = T_n^{-1} if and only if D_nF_n^{-1}D_n = T_n if and only if F_n^{-1} = D_nT_nD_n.
Let's illustrate this last point for n = 5. From Equation (2.35),

I_5 = T_5Y
= T_5(D_5F_5D_5),

proving that D_5F_5D_5 = T_5^{-1}. Observe that

\[ F_5(D_5T_5D_5) = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 \\ 2 & 3 & 1 & 0 & 0 \\ 6 & 11 & 6 & 1 & 0 \\ 24 & 50 & 35 & 10 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ -1 & 1 & 0 & 0 & 0 \\ 1 & -3 & 1 & 0 & 0 \\ -1 & 7 & -6 & 1 & 0 \\ 1 & -15 & 25 & -10 & 1 \end{pmatrix} = I_5, \]

confirming that D_5T_5D_5 = F_5^{-1}. □



2.5.8 Theorem. Let F_n = (s(i,j)) and T_n = (S(i,j)) be n × n matrices of
Stirling numbers of the first and second kinds, respectively. Let D_n be the n × n
diagonal matrix whose (i,i)-entry is (-1)^i, 1 ≤ i ≤ n. Then T_n^{-1} = D_nF_nD_n and
F_n^{-1} = D_nT_nD_n.

Proof. From the remarks preceding the statement of Theorem 2.5.8, its two
conclusions are equivalent. So, it will suffice to prove the first, namely, that
T_nY_n = I_n, where Y_n = D_nF_nD_n. This, in turn, is equivalent to proving that

\[ \sum_{k=1}^{n} (-1)^{k+j} S(i,k)\,s(k,j) = \delta_{i,j}, \quad 1 \le i,j \le n. \tag{2.36} \]

Because s(i,n) = 0, 1 ≤ i < n, the only nonzero entry in the last column of Y_n is
s(n,n) = 1. Similarly, because S(i,n) = 0, 1 ≤ i < n, the only nonzero entry in the
last column of T_n is S(n,n) = 1. From these observations, we draw two
conclusions. First, the last column of T_nY_n is equal to the last column of I_n, which
establishes the j = n case of Equation (2.36). Second, the leading (n - 1) × (n - 1)
principal submatrix of T_nY_n is T_{n-1}Y_{n-1}.

It follows from the second of these conclusions that Equation (2.35) establishes
the theorem, not only for n = 5, but for n = 1, 2, 3, and 4 as well. It is a
consequence of both conclusions that, to complete a proof by induction on n, all one
needs do is prove that the entries in the first n - 1 columns of the nth row of T_nY_n
are all zero. In other words, it suffices to prove that

\[ \sum_{k=1}^{n} (-1)^{k+j} S(n,k)\,s(k,j) = 0, \quad 1 \le j < n. \tag{2.37} \]

Replacing m with k in Equations (2.33) and (2.34) gives

\[ x^{(k)} = x(x-1)\cdots(x-k+1) = \cdots + (-1)^{k+j} s(k,j)\,x^j + \cdots. \tag{2.38} \]

Multiplying both sides of Equation (2.38) by S(n,k) and summing on k, we obtain

\begin{align*}
\sum_{k=1}^{n} S(n,k)\,x^{(k)} &= \cdots + \left(\sum_{k=1}^{n} (-1)^{k+j} S(n,k)\,s(k,j)\right) x^j + \cdots \tag{2.39} \\
&= x^n, \tag{2.40}
\end{align*}

by Theorem 2.2.2. Comparing coefficients of x^j on the right-hand sides of
Equations (2.39) and (2.40) produces

\[ \sum_{k=1}^{n} (-1)^{k+j} S(n,k)\,s(k,j) = 0, \quad 1 \le j < n. \tag{2.41} \]
■

2.5.9 Example. Let's confirm Equation (2.41) when n = 6 and j = 2. From
row 6 of Fig. 2.1.2, the Stirling numbers of the second kind, S(6,k), 1 ≤ k ≤ 6,
are 1, 31, 90, 65, 15, and 1. From the second column of Fig. 2.5.2, the Stirling
numbers of the first kind, s(k,2), 1 ≤ k ≤ 6, are 0, 1, 3, 11, 50, and 274. Substituting
these values into Equation (2.41) gives

-S(6,1)s(1,2) + S(6,2)s(2,2) - S(6,3)s(3,2) + ··· + S(6,6)s(6,2)
= -1 × 0 + 31 × 1 - 90 × 3 + 65 × 11 - 15 × 50 + 1 × 274
= 0. □
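Both Theorem 2.5.8 and the confirmation in Example 2.5.9 are easy to repeat by machine for other values of n. The Python sketch below is an illustration, not the text's method: the function names are hypothetical, plain nested lists stand in for matrices, and the seed values s(0,0) = S(0,0) = 1 are an assumed convention that starts both recurrences (Theorems 2.5.1 and 2.1.20).

```python
def first_kind_matrix(n):
    """F_n: entry [i-1][j-1] is s(i, j), from s(m+1, k) = s(m, k-1) + m*s(m, k)."""
    s = [[0] * (n + 1) for _ in range(n + 1)]
    s[0][0] = 1                                   # assumed seed value
    for m in range(n):
        for k in range(1, n + 1):
            s[m + 1][k] = s[m][k - 1] + m * s[m][k]
    return [row[1:] for row in s[1:]]


def second_kind_matrix(n):
    """T_n: entry [i-1][j-1] is S(i, j), from S(m+1, k) = S(m, k-1) + k*S(m, k)."""
    S = [[0] * (n + 1) for _ in range(n + 1)]
    S[0][0] = 1                                   # assumed seed value
    for m in range(n):
        for k in range(1, n + 1):
            S[m + 1][k] = S[m][k - 1] + k * S[m][k]
    return [row[1:] for row in S[1:]]


def checkerboard(A):
    """D_n A D_n: multiply the (i, j)-entry by (-1)**(i + j)."""
    return [[(-1) ** (i + j) * a for j, a in enumerate(row)] for i, row in enumerate(A)]


def matmul(A, B):
    """Ordinary matrix product of two square nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


n = 6
F, T = first_kind_matrix(n), second_kind_matrix(n)
assert matmul(T, checkerboard(F)) == identity(n)   # T_n^{-1} = D_n F_n D_n
assert matmul(F, checkerboard(T)) == identity(n)   # F_n^{-1} = D_n T_n D_n

# Equation (2.41) with n = 6 and j = 2, as in Example 2.5.9
total = sum((-1) ** (k + 2) * T[5][k - 1] * F[k - 1][1] for k in range(1, 7))
print(total)
```

Changing n in the script repeats the check for larger matrices; the parity of i + j is the same whether rows and columns are counted from 0 or from 1, so the zero-based indices in `checkerboard` match the definition of D_n.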



2.5. EXERCISES 

1 From Fig. 2.5.2, s(4,2) = 11. Exhibit the 11 permutations in S_4 whose disjoint
cycle factorizations consist of exactly two cycles.

2 Compute s(7,2) using Equation (2.28). (Hint: Example 2.5.3.)

3 Confirm that s(6,3) = 225 by showing

(a) that p_3(6) = 3.

(b) that the three-part partitions of 6 are the cycle types of 15, 90, and 120
permutations in S_6.

4 Using the approach outlined in Exercise 3, confirm
(a) that s(7,4) = 735. (b) that s(8,3) = 13,132.

5 Using a method of your choice, compute

(a) s(8,n), 1 ≤ n ≤ 8. (b) s(9,n), 1 ≤ n ≤ 9.

6 Show that E_r(1, 2, ..., k) = s(k + 1, k + 1 - r).

7 Prove that s(m, m - 1) = C(m, 2).

8 Fill in the blanks (using actual numbers):

(a) x^{(5)} = x^5 - ___ x^4 + ___ x^3 - ___ x^2 + ___ x.

(b) x^5 = x^{(5)} + ___ x^{(4)} + ___ x^{(3)} + ___ x^{(2)} + ___ x.

9 Compute

(a) E_{5-r}(1, 2, 3, 4) and confirm that the answer is s(5,r), 1 ≤ r ≤ 5.

(b) E_{6-n}(1, 2, 3, 4, 5) and confirm that the answer is s(6,n), 1 ≤ n ≤ 6.

10 Show that \sum_{k=1}^{n} (-1)^{i+k} S(i,k)s(k,j) = δ_{i,j}, 1 ≤ i,j ≤ n.

11 Prove that

(a) m! = s(m,1) + s(m,2) + ··· + s(m,m).

(b) n! = s(n,n)n^n - s(n,n - 1)n^{n-1} + s(n,n - 2)n^{n-2} - ··· + (-1)^{n-1} s(n,1)n.

12 Prove that s(m,1) - s(m,2) + s(m,3) - s(m,4) + ··· - (-1)^m s(m,m) = 0,
m > 1. (Compare with Lemma 1.5.8.)

13 Base a new proof of Corollary 2.5.5 on Lemma 1.9.8 and Theorem 2.5.1.

14 Prove that s(m + 1, n + 1) = \sum_{k=n}^{m} (m - k)! C(m,k)s(k,n).

15 Prove the following analog of Exercise 13, Section 2.1:

\[ s(m+1,n+1) = \sum_{k=n}^{m} s(m,k)\,C(k,n). \]

16 Prove that

\[ \sum_{k=j}^{i} (-1)^{k+j} S(i+1,k+1)\,s(k,j) = C(i,j). \]

(Hint: Exercise 13, Section 2.1.)

17 Let g(n) (not to be confused with g_m(x)) be some function of n. Suppose f is
another function, defined in terms of g by

(a) f(m) = \sum_{n=1}^{m} S(m,n)g(n), with a big S. Prove that g(m) =
\sum_{n=1}^{m} (-1)^{m+n} s(m,n)f(n), with a small s.

(b) f(m) = \sum_{n=1}^{m} s(m,n)g(n), with a small s. Prove that g(m) =
\sum_{n=1}^{m} (-1)^{m+n} S(m,n)f(n), with a big S.

18 Equation (2.9) in Section 2.2 suggests a role for Stirling numbers of the second
kind in evaluating the sum of the mth powers of the first n positive integers.
Explain how Stirling numbers of the first kind might be used to evaluate this
same mth-power sum. (Hint: Exercise 12, Section 1.9.)

19 Prove that \sum_{n=1}^{m} (-1)^{m+n} s(m,n)B_n = 1, where B_n is the nth Bell number.

20 Confirm the identity in Exercise 19 when
(a) m = 4. (b) m = 5.
21 If p is an odd prime, then p is a factor of s(p, r), 1 < r < p. 

(a) Confirm this result when p = 7. 

(b) Show that this result need not remain true if "prime" is replaced with 
"composite integer". 

22 Suppose 1 < n < m. Generalize Theorem 2.5.2 by showing that 

s(m,n) = (m-l)\ £ jfj/W -1 . 

(Hint: Corollary 2.5.5.) 

23 Use the formula from Exercise 22 to evaluate
(a) s(4,2). (b) s(4,3).

(c) s(5,2). (d) s(5,3).

24 It can be shown that

\[ s(m,n) = \frac{m!}{n!} \sum \prod_{t=1}^{n} \frac{1}{r_t}, \]

where the summation is over all compositions r_1 + r_2 + ··· + r_n = m having
n parts. Use this formula to evaluate

(a) s(4,2). (b) s(4,3).

(c) s(5,2). (d) s(5,3).

25 Confirm the equation

\[ \sum_{t=1}^{m} \frac{1}{t^2} = \left(\frac{s(m+1,2)}{m!}\right)^2 - \frac{2s(m+1,3)}{m!} \]

(a) for m = 1. (b) for m = 2.
(c) for m = 3. (d) for m = 4.

26 Show that s(m + 1, m - 2) = (1/18)m(m + 1)[3s(m + 1, m - 1) - m^3 + m].
(Hint: Exercise 23, Section 1.9.)

27 Write an algorithm/program to compute s(m,n), 1 ≤ m ≤ 10, 1 ≤ n ≤ m.

28 Suppose n and k are positive integers. Let A be the k × k matrix whose (i,j)-
entry is s(n + i, j). Prove that det(A) = (n!)^k. (Hint: Using appropriate
elementary row operations, show that det(A) = det(U), where U is a k × k
upper triangular matrix each of whose diagonal entries is n!.)

29 If A and B are r × r and s × s matrices, respectively, their direct sum A ⊕ B is
the (r + s) × (r + s) matrix

\[ A \oplus B = \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix}. \]

(a) Show that the (n + 1) × (n + 1) matrix of Stirling numbers of the first
kind,

F_{n+1} = (I_1 ⊕ F_n)C_{[0,n]},

where

\[ C_{[0,n]} = \begin{pmatrix} C(0,0) & C(0,1) & \cdots & C(0,n) \\ C(1,0) & C(1,1) & \cdots & C(1,n) \\ \vdots & \vdots & & \vdots \\ C(n,0) & C(n,1) & \cdots & C(n,n) \end{pmatrix} \]

is the (n + 1) × (n + 1) generalized Pascal matrix of Exercise 25,
Section 1.5, and I_1 is the 1 × 1 identity matrix.

(b) Confirm part (a) when n = 4.

(c) Show that F_4 = (I_2 ⊕ C_{[0,1]}) × (I_1 ⊕ C_{[0,2]}) × C_{[0,3]}.

(d) Suggest a factorization for F_n in terms of direct sums of identity
matrices and matrices of binomial coefficients.

(e) Prove or disprove your suggested generalization.

30 Confirm Theorem 2.5.8 when n = 6, i.e., show that

(a) T_6D_6F_6D_6 = I_6.

(b) D_6T_6D_6F_6 = I_6.

31 Using the notation from Exercise 29,

(a) Show that the (n + 1) × (n + 1) matrix of Stirling numbers of the second
kind

T_{n+1} = C_{[0,n]}(I_1 ⊕ T_n).

(b) Confirm part (a) when n = 4.

(c) Show that T_4 = C_{[0,3]} × (I_1 ⊕ C_{[0,2]}) × (I_2 ⊕ C_{[0,1]}).

(d) Suggest a factorization for T_n in terms of direct sums of identity matrices
and matrices of binomial coefficients.

(e) Prove or disprove your suggested generalization.

Man is but a reed, the most feeble thing in nature; but he is a thinking reed.

— Blaise Pascal (Pensees)

Chapters 3-6 are completely independent of each other. Following Chapters 1 and 2,
the final four can be read in any order.

The topics of Chapter 3 are deeper than the second stratum, in part because they 
involve compositions of functions. Basic definitions of permutation groups are 
introduced in Sections 3.1 and 3.2. While there may be some overlap of this mate- 
rial with abstract algebra, the perspective is different. Section 3.3, e.g., contains a 
lovely characterization of multiple transitivity in terms of Bell numbers. Because 
transitivity is not an invariant of abstract groups, themes like this are unlikely to 
receive the same emphasis in an algebra course. 

Burnside's lemma from Section 3.3 and symmetry groups from Section 3.4 are 
used in Section 3.5 to count color patterns. The finer enumeration of color patterns 
by weight using Polya's pattern inventory is found in Section 3.6. Because it is a 
symmetric polynomial, the pattern inventory is a polynomial in the power sums 
(Theorem 1.9.11). This cycle index polynomial is the subject of Section 3.7. 

There are many natural places to exit from Chapter 3, e.g., at the ends of 
Sections 3.3, 3.5, or 3.6, or immediately after the statement of Theorem 3.6.5. 

3.1. FUNCTION COMPOSITION 

Let f : D → R be a function. While there is general agreement that D should be
called the domain of f, not everyone concurs that range is the proper name for
R; some authors use "range" to denote the set {f(x) : x ∈ D}.

3.1.1 Definition. Let f : D → R be a function. The image of f is the set

f(D) = {f(x) : x ∈ D}, sometimes denoted image(f).

Note that image(f) = f(D) ⊂ R, with equality if and only if f is onto. If
f ∈ F_{m,n}, then f(D) is the set of numbers that appear in the sequence

(f(1), f(2), ..., f(m)).

Suppose f : D → R and g : A → B are functions. If f(D) ⊂ A, then the
composition of g and f is the function g∘f : D → B defined by g∘f(x) = g(f(x)). (In
calculus, the derivative of a composition of functions is described by the chain rule.)

There is an awkward "backwardness" about the standard notation for function
composition. It is occasioned by the fact that we read from left to right but evaluate
a composition from right to left: The rule of assignment g∘f is determined by first
applying f and then applying g. The French school has eliminated the difficulty by
putting the function on the right, i.e., writing xf rather than f(x). In the French
scheme, cumbersome expressions like g∘f(x) and g(f(x)) become xfg. Because
this right-handed notation has not been widely accepted in the United States, we
will stick with the familiar f(x).

3.1.2 Example. If f ∈ F_{2,5} and g ∈ F_{5,3}, where might g∘f be found? Because f
is applied first, g∘f shares the domain of f. Because g is applied second,
image(g∘f) ⊂ image(g); so g∘f shares the range of g. Therefore, g∘f ∈ F_{2,3}.
To take a specific example, let f = (3, 4) ∈ F_{2,5} and g = (3, 3, 2, 1, 3) ∈ F_{5,3}. Then

g∘f(1) = g(f(1)) = g(3) = 2,
g∘f(2) = g(f(2)) = g(4) = 1,

so g∘f = (2, 1).

What about f∘g? Because that little circle looks like multiplication, one might
be tempted to conclude that g∘f = f∘g. Let's check it out. Observe that
f∘g(1) = f(g(1)) = f(3). Given that f = (3, 4), what is f(3)? (Don't say
f(3) = 4. This is no time to confuse sequences with cycles. The cycle idea is valid
only in the context of permutations. While f ∈ F_{2,5} may be one-to-one, it most
certainly is not onto.) Because 3 ∉ {1, 2}, the domain of f, "f(3)" is nonsense; there is
no third component in the sequence (3, 4) = (f(1), f(2)). Since f(3) doesn't exist,
f∘g doesn't exist either. In other words, it doesn't make sense even to write f∘g,
much less expect that it should equal g∘f = (2, 1). □

3.1.3 Example. Suppose f = (3, 2, 1, 1, 2) ∈ F_{5,3} and g = (2, 1, 1) ∈ F_{3,2}. Then
image(f) = range(f) = {1, 2, 3} = domain(g), so there is a function g∘f ∈ F_{5,2}.
To determine which function it is requires a little work:

g∘f(1) = g(f(1)) = g(3) = 1,
g∘f(2) = g(f(2)) = g(2) = 1,
g∘f(3) = g(f(3)) = g(1) = 2,
g∘f(4) = g(f(4)) = g(1) = 2,
g∘f(5) = g(f(5)) = g(2) = 1,

so g∘f = (1, 1, 2, 2, 1). What about f∘g? This time image(g) = {1, 2} ⊂
{1, 2, 3, 4, 5} = domain(f), so f∘g is a legitimate function. Maybe now f∘g =
g∘f? Let's see. The domain of f∘g is domain(g) = {1, 2, 3};

f∘g(1) = f(g(1)) = f(2) = 2,
f∘g(2) = f(g(2)) = f(1) = 3,
f∘g(3) = f(g(3)) = f(1) = 3,

so f∘g = (2, 3, 3) ∈ F_{3,3}, which is not hard to distinguish from g∘f =
(1, 1, 2, 2, 1) ∈ F_{5,2}. □

What is the easy way to compute function compositions? Unfortunately, there
are no shortcuts. With a little experience, one can find g∘f without taking up so
much space, but only by keeping track of all the steps in one's head. Give it a try.
Let f, g ∈ F_{4,4} be defined by f = (1, 1, 2, 2) and g = (4, 3, 1, 1). If you can, confirm
in your head that g∘f = (4, 4, 3, 3) and f∘g = (2, 2, 1, 1). If you can't manage to
do it in your head, that's not a problem, provided you work it out with pencil and
paper!
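For readers who prefer a keyboard to pencil and paper, composition in sequence notation takes only a few lines. The Python sketch below is an illustration with a hypothetical function name: a function in F_{m,n} is stored as a tuple of length m whose values lie in {1, ..., n}, and the composition is refused when the image of f is not contained in the domain of g, as happened with f∘g in Example 3.1.2.

```python
def compose(g, f):
    """Return g∘f in sequence notation: position x holds g(f(x)).

    f and g are tuples using the 1-based values of the text, so
    f[x - 1] is f(x). The domain of g is {1, ..., len(g)}.
    """
    if any(not 1 <= y <= len(g) for y in f):
        raise ValueError("image(f) is not contained in domain(g)")
    return tuple(g[y - 1] for y in f)


f, g = (1, 1, 2, 2), (4, 3, 1, 1)
print(compose(g, f))   # g∘f
print(compose(f, g))   # f∘g

# Example 3.1.3: both compositions exist, but they are different functions
f3, g3 = (3, 2, 1, 1, 2), (2, 1, 1)
print(compose(g3, f3), compose(f3, g3))
```

Calling `compose((3, 4), (3, 3, 2, 1, 3))`, i.e., asking for f∘g with the functions of Example 3.1.2, raises the `ValueError`, because g takes the value 3, which is outside the domain {1, 2} of f.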

What about composing three functions? The only really good news here is that
function composition is associative. If the domains and images match up so that
f∘(g∘h) makes sense, then (f∘g)∘h also makes sense, and

f∘(g∘h) = (f∘g)∘h. (3.1)

This is useful for two reasons. It means f∘g∘h is unambiguous, and it means that
f∘g∘h can be evaluated one composition at a time.

Suppose f ∈ F_{m,m} is a permutation. Then f ∈ S_m is one-to-one (and onto). So, f
has an inverse. It might be helpful at this point to recall the definition of "inverse".

3.1.4 Definition. Suppose f : D → R and g : R → D are functions. Then g is
the inverse of f if

g∘f(d) = d for every d ∈ D, (3.2)

and

f∘g(r) = r for every r ∈ R. (3.3)

If f has an inverse, then its rule of assignment is uniquely determined by
Equation (3.2). In other words, if f has an inverse, it is unique. The inverse of f
is typically written, not g, but f^{-1}. Two things about this notation deserve comment.
The first is that f^{-1} is just a name for the unique function g that, along with f,
satisfies Equations (3.2) and (3.3). The second is that Equations (3.2) and (3.3) are
symmetric, i.e., f^{-1} = g if and only if g^{-1} = f. (In particular, [f^{-1}]^{-1} = f.)

From this point on, our primary interest will be in the composition of
permutations.

3.1.5 Example. Focusing on permutations does not affect function composition, 
but disjoint cycle notation changes the way it looks! If p_1 = (1473)(2)(56) and 
p_2 = (167)(24)(35), then 

p_1 ∘ p_2 = (1473)(2)(56) ∘ (167)(24)(35) (3.4) 
          = (15)(274)(36), (3.5) 

and 

p_2 ∘ p_1 = (167)(24)(35) ∘ (1473)(2)(56) 
          = (124)(36)(57). 

There is a purely mechanical way to produce the disjoint cycle factorization of 
p_1 ∘ p_2. Write "(1". Then place your finger at the right-hand end of Equation (3.4) 
and start moving it to the left, searching for the number 1. When your finger comes 
to 1, stop. The number immediately to the right of 1 is p_2(1) = 6. (So far, so good: 
p_1 ∘ p_2(1) = p_1(p_2(1)) = p_1(6). It remains to find p_1(6).) Resume the leftward 
motion of your finger, but with a new objective. Instead of searching for 1, look 
for (another occurrence of) 6. When you come to 6, stop. (Having already 
determined that p_2(1) = 6, we are about to find p_1(6).) Because 6 is the last number 
in its cycle, move your finger leftward to the first number of that same cycle. In 
this case, that number is 5. Write 5 next to 1 in "(1", obtaining "(15". 

Now, return your finger to the far right-hand end of Equation (3.4) and repeat the 
process, this time beginning your search with 5. Because 5 is the first number 
encountered, the search is brief. As 5 is at the end of its cycle, move your finger 
to the 3 at the beginning of the (same) cycle. (You have just determined that 
p_2(5) = 3. The next step is to determine p_1(3).) Without writing anything down, 
resume your leftward movement, looking for the next occurrence of 3. Since it is 
found at the end of its cycle, move your finger to the front of that same cycle, 
bringing it to rest on 1. Evidently, 1 = p_1(3) = p_1(p_2(5)). In the disjoint cycle 
factorization of p_1 ∘ p_2, 1 follows 5. Since we opened the cycle with 1, it is time 
to close the cycle, i.e., change "(15" to "(15)". 

Next, find the smallest number that has not yet been used. In this case it is 2. 
Replace "(15)" with "(15)(2". Place your finger at the far right-hand end of 
Equation (3.4) and repeat the process, searching for 2. Continue in this way until 
you've obtained Equation (3.5). □ 
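
The finger-tracing procedure of Example 3.1.5 can also be sketched in a few lines of Python. This is only an illustration, not part of the text's development: the helper names cycles_to_map, compose, and disjoint_cycles are hypothetical, and a permutation is assumed to be stored as a dictionary sending i to p(i).

```python
def cycles_to_map(cycles, m):
    """Turn cycles such as [(1, 4, 7, 3), (2,), (5, 6)] into a dict i -> p(i) on {1..m}."""
    p = {i: i for i in range(1, m + 1)}
    for cyc in cycles:
        for pos, i in enumerate(cyc):
            p[i] = cyc[(pos + 1) % len(cyc)]
    return p

def compose(p, q):
    """Return p o q, the map i -> p(q(i)): apply q first, then p."""
    return {i: p[q[i]] for i in q}

def disjoint_cycles(p):
    """Disjoint cycle factorization; each cycle opens with the smallest unused number."""
    seen, out = set(), []
    for start in sorted(p):
        if start in seen:
            continue
        cyc, i = [], start
        while i not in seen:
            seen.add(i)
            cyc.append(i)
            i = p[i]
        out.append(tuple(cyc))
    return out

p1 = cycles_to_map([(1, 4, 7, 3), (2,), (5, 6)], 7)
p2 = cycles_to_map([(1, 6, 7), (2, 4), (3, 5)], 7)
print(disjoint_cycles(compose(p1, p2)))  # [(1, 5), (2, 7, 4), (3, 6)]
print(disjoint_cycles(compose(p2, p1)))  # [(1, 2, 4), (3, 6), (5, 7)]
```

The two printed factorizations agree with Equation (3.5) and with the value of p_2 ∘ p_1 computed in the example, and they illustrate that the two compositions differ.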

3.1.6 Definition. Let e_m ∈ S_m be the function defined by e_m(i) = i, 1 ≤ i ≤ m. 
The permutation e_m is called the identity of S_m. In disjoint cycle notation, 
e_m = (1)(2)···(m). 

Before reading on, convince yourself that 

f ∘ e_m = f = e_m ∘ f (3.6) 

for every f ∈ S_m. A more significant application of Definition 3.1.6 is the following 
useful alternative to the definition of inverse, one that is special to permutations. 

3.1.7 Theorem. Suppose f, g ∈ S_m. Then g = f^{-1} if and only if g ∘ f = e_m and 
f ∘ g = e_m. 

Proof. This is just a restatement of Definition 3.1.4 using e_m. ■ 

We now come to an important technical observation. 

3.1.8 Lemma. If p, q ∈ S_m, then, while they may not be equal, both p ∘ q and 
q ∘ p exist, and both are permutations in S_m. 

Proof. Because S_m ⊆ F_{m,m}, both p ∘ q and q ∘ p exist as functions in F_{m,m}. It 
remains to prove that they are permutations. By definition, S_m consists of those 
functions f ∈ F_{m,m} that are one-to-one (and onto), i.e., S_m consists (precisely) of 
the invertible functions in F_{m,m}. It follows from [f^{-1}]^{-1} = f that the inverse of 
an invertible function is invertible, so p^{-1}, q^{-1} ∈ S_m. To see that q ∘ p is 
invertible, observe that 

(q ∘ p) ∘ (p^{-1} ∘ q^{-1}) = q ∘ (p ∘ p^{-1}) ∘ q^{-1} 
                            = q ∘ e_m ∘ q^{-1} 
                            = q ∘ q^{-1} 
                            = e_m 

by associativity, Theorem 3.1.7, and Equation (3.6). The identity (p^{-1} ∘ q^{-1}) ∘ 
(q ∘ p) = e_m can be proved similarly. Thus, by Theorem 3.1.7, 

p^{-1} ∘ q^{-1} = (q ∘ p)^{-1}, (3.7) 

the inverse of q ∘ p. In particular, q ∘ p has an inverse, which is the criterion that 
must be met to guarantee that q ∘ p ∈ S_m. Interchanging p and q in Equation (3.7) 
yields (p ∘ q)^{-1} = q^{-1} ∘ p^{-1}, proving that p ∘ q ∈ S_m. ■ 

3.1.9 Example. Let p = (1524)(3) and q = (143)(25). Then p^{-1} = (4251)(3) = 
(1425)(3) and q^{-1} = (341)(52) = (134)(25). Let's confirm Equation (3.7) by 
comparing p^{-1} ∘ q^{-1} with (q ∘ p)^{-1}. Observe that 

p^{-1} ∘ q^{-1} = (1425)(3) ∘ (134)(25) 
                = (132)(4)(5). 

Next, compute 

q ∘ p = (143)(25) ∘ (1524)(3) 
      = (123)(4)(5), 

from which it follows that (q ∘ p)^{-1} = (321)(4)(5) = (132)(4)(5). □ 
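
As a cross-check on Example 3.1.9, Equation (3.7) can be tested numerically. The following is a minimal sketch under the assumption that permutations are stored as dictionaries; the helper names from_cycles, compose, and inverse are illustrative, not notation from the text.

```python
def from_cycles(cycles, m):
    """Build the dict i -> p(i) on {1..m}; numbers not listed are fixed points."""
    p = {i: i for i in range(1, m + 1)}
    for cyc in cycles:
        for pos, i in enumerate(cyc):
            p[i] = cyc[(pos + 1) % len(cyc)]
    return p

def compose(p, q):
    """p o q: apply q first, then p."""
    return {i: p[q[i]] for i in q}

def inverse(p):
    """Swap each pair (i, p(i)) to obtain the inverse permutation."""
    return {image: i for i, image in p.items()}

p = from_cycles([(1, 5, 2, 4)], 5)       # (1524)(3)
q = from_cycles([(1, 4, 3), (2, 5)], 5)  # (143)(25)

lhs = compose(inverse(p), inverse(q))    # p^{-1} o q^{-1}
rhs = inverse(compose(q, p))             # (q o p)^{-1}
print(lhs == rhs)                        # True
print(lhs == from_cycles([(1, 3, 2)], 5))  # True: both equal (132)(4)(5)
```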

One interpretation of Lemma 3.1.8 is that function composition is a binary 
operation on the set S_m. In a calculus course, one must contend with a variety of 
operations. There, it is important to distinguish the composition of two functions 
from their product (e.g., the chain rule from the product rule) and the inverse 
from the reciprocal (i.e., f^{-1} from 1/f). In the context of S_m, however, composition 
is the only binary operation that we will be discussing. This leads to several more 
"abuses of the language." For one thing, it can do no harm to drop the little circle 
from the notation for function composition. 

3.1.10 Convention. If f, g ∈ S_m, then gf = g ∘ f, i.e., the composition of g and f 
may be expressed as gf, without the little circle. 

So far, so good. But the next abuse may be a little harder to swallow. The 
language and notation normally used with generic binary operations are borrowed 
from multiplication. We have already spoken, e.g., of disjoint cycle factorizations. 
We will occasionally go even further and describe gf as a product. 

3.1.11 Convention. If f, g ∈ S_m, then the composition g ∘ f = gf is also known 
as the product of g and f. 

While o(S_m) = m! may be large, it is finite. In principle, at least, all (m!)^2 
products of its elements can be tabulated explicitly in a so-called Cayley table. 
A Cayley table for S_3 can be found in Fig. 3.1.1. 





          | e_3      (12)(3)  (13)(2)  (1)(23)  (123)    (132) 
 ---------+------------------------------------------------------ 
  e_3     | e_3      (12)(3)  (13)(2)  (1)(23)  (123)    (132) 
  (12)(3) | (12)(3)  e_3      (132)    (123)    (1)(23)  (13)(2) 
  (13)(2) | (13)(2)  (123)    e_3      (132)    (12)(3)  (1)(23) 
  (1)(23) | (1)(23)  (132)    (123)    e_3      (13)(2)  (12)(3) 
  (123)   | (123)    (13)(2)  (1)(23)  (12)(3)  (132)    e_3 
  (132)   | (132)    (1)(23)  (12)(3)  (13)(2)  e_3      (123) 

Figure 3.1.1. Cayley Table for S_3. 



'After Sir Arthur Cayley (1821-1895). 






Because function composition is not commutative, care must be exercised when 
reading Cayley tables. 

3.1.12 Convention. In Cayley tables associated with this book, fg is found in 
row f and column g. 

Lemma 3.1.8 guarantees that, in a Cayley table for S_m, there are no missing 
entries, and there are no entries that do not come from S_m. Any two elements of 
S_m may be composed, and the result is another permutation in S_m. It turns out 
that some subsets of S_m also exhibit this closure property. 

3.1.13 Definition. A nonempty subset G of S_m is closed if fg ∈ G for all 
f, g ∈ G. 

We have already proved that f, g ∈ G implies fg ∈ S_m. That's not the point. The 
issue is whether the composition is an element of the subset G. 

3.1.14 Example. Of the 63 nonempty subsets of S_3, only six are closed. Apart 
from S_3 itself, the other five are {e_3}, {e_3, (12)(3)}, {e_3, (13)(2)}, {e_3, (1)(23)}, 
and {e_3, (123), (132)}. If S is one of the remaining 57 nonempty subsets of S_3, there 
exist permutations f, g ∈ S such that fg ∉ S. 

From our perspective, there is a kind of aristocracy among the subsets of S m . The 
closed subsets are called subgroups. □ 
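
The count in Example 3.1.14 is small enough to confirm by brute force. The sketch below is an illustration only: it assumes permutations of {0, 1, 2} encoded as tuples (a relabeling of {1, 2, 3}), and the names compose and is_closed are hypothetical.

```python
from itertools import combinations, permutations

# Each tuple t encodes the permutation i -> t[i] of {0, 1, 2}.
S3 = list(permutations(range(3)))

def compose(f, g):
    """f o g: apply g first, then f."""
    return tuple(f[g[i]] for i in range(len(g)))

def is_closed(subset):
    """Definition 3.1.13: fg must lie in the subset for all f, g in it."""
    s = set(subset)
    return all(compose(f, g) in s for f in s for g in s)

nonempty = [c for r in range(1, 7) for c in combinations(S3, r)]
closed = [c for c in nonempty if is_closed(c)]
print(len(nonempty), len(closed))        # 63 6
print(sorted(len(c) for c in closed))    # [1, 2, 2, 2, 3, 6]
```

The sizes 1, 2, 2, 2, 3, 6 match the six closed subsets listed in the example.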

3.1.15 Definition. Let G be a (nonempty) closed subset of S_m. Then G is a 
subgroup of S_m, or a permutation group of degree m. 

In biology, a riparian habitat is found at the boundary of water and land. Life 
occurs in its richest diversity in the vicinity of such natural boundaries. A similar 
richness may frequently be found near the boundaries of mathematical disciplines. 
That is where we are now, at the boundary between combinatorics and algebra. 
Because every finite group is isomorphic to a permutation group, the case is 
sometimes made that combinatorial group theory embraces all finite group theory. 
At best, that viewpoint is misleading. Two permutation groups that are isomorphic 
as abstract groups may have very different combinatorial properties. It is the com- 
binatorial properties of permutation groups that are of interest in this chapter. 

One final pedagogical issue needs to be discussed. The group S_m has been 
defined in terms of the permutations of V = {1, 2, ..., m}. The fact that V is a 
set of numbers is beside the point. We have used V because it is convenient. We 
might just as well have discussed the set of permutations of Y = {y_1, y_2, ..., y_m}, 
denoting it S_Y. (In that notation, S_m becomes S_V.) Strictly speaking, elements of S_Y 
permute the y's, whereas elements of S_m permute their subscripts. But the "action" 
is the same. For our purposes, S_m and S_Y are clones. When the time comes to talk 
about permutations of Y, we will talk about S_m acting on Y. 
3.1. EXERCISES 

Each problem that I solved became a rule which served afterwards to solve other 
problems. 

— René Descartes (Discourse on Method) 

1 Let f, g, h ∈ F_{5,5} be defined by f = (1, 2, 1, 3, 5), g = (4, 1, 5, 2, 2), and 
h = (1, 3, 1, 3, 3). Compute 

(a) f ∘ g. (b) g ∘ f. (c) f ∘ h. (d) h ∘ f. 

(e) g ∘ h. (f) h ∘ g. (g) f ∘ g ∘ h. (h) f ∘ h ∘ g. 

(i) g ∘ f ∘ h. (j) g ∘ h ∘ f. (k) h ∘ f ∘ g. (l) h ∘ g ∘ f. 

(m) f ∘ f. (n) g ∘ g. (o) h ∘ h. (p) h ∘ g ∘ h. 

2 Find the images of f, g, and h in Exercise 1. 

3 Let f, g, h ∈ S_5 be defined by f = (1)(253)(4), g = (13425), and h = 
(14)(25)(3). Find the disjoint cycle factorization of 



(a) fg. 


(b) g/. 


(c) fh. 


(d) 


(e) gh. 


(f) 


(g) 


(h) y^g. 


(i) gfh. 


0) 


(k) hfg. 


(i) 


(m) ff. 


(n) jQf. 


(o) M. 


(p) ^g- 


(q) r 1 - 


(r) r 1 - 


(s) hr 1 . 


(t) rV- 



4 Let f, g ∈ F_{6,6} be defined by f = (1, 3, 6, 4, 2, 5) and g = (2, 3, 1, 5, 6, 4). 

(a) Express f ∘ g as a sequence. 

(b) Express g ∘ f as a sequence. 

(c) Express f^{-1} as a sequence. 

(d) Find the disjoint cycle factorization of f. 

(e) Find the disjoint cycle factorization of g. 

(f) Use your answer to part (a) to express f ∘ g in disjoint cycle notation. 

(g) Use your answers to parts (d) and (e) to find the disjoint cycle factorization 
of f ∘ g. 

5 Find an appropriate expression for the unique function f : D → R that satisfies 
f(1) = 3, f(2) = 1, and f(3) = 2 

(a) when D = R is the set of real numbers and f(x) is a polynomial of degree 2. 

(b) when D = R = {1, 2, 3} and f ∈ F_{3,3} is interpreted as a sequence. 

(c) when f ∈ S_3 is expressed in disjoint cycle notation. 

6 Exhibit the Cayley table for S_2. 

7 Let p ∈ S_3. Show that {p} is a subgroup of S_3 if and only if p = e_3. 

8 Explain, in words, how the Cayley table in Fig. 3.1.1 can be used to find p^{-1} 
for any permutation p ∈ S_3. 

9 A curious fact about the Cayley table in Fig. 3.1.1 is that, apart from the 
headings, no element of S_3 occurs twice in any row or column. Prove that this 
property is valid in the Cayley table for S_m for all m ≥ 2. 

10 Let A_3 = {e_3, (123), (132)}. 

(a) Exhibit the 3 x 3 Cayley table for A_3. 

(b) Prove that A_3 is a permutation group. (Hint: Were you able to find an 
element of the set A_3 to fill every place in the table?) 

(c) In what sense is the Cayley table you constructed for A_3 "symmetric"? 
Explain the implications of symmetry for this binary operation on A_3. 

11 Prove that 

(a) G = {e_4, (12)(3)(4), (1)(2)(34), (12)(34)} is a permutation group of 
degree 4. 

(b) G = {e_4, (12)(34), (13)(24), (14)(23)} is a subgroup of S_4. 

(c) G = {e_4, (1234), (13)(24), (1432)} is a permutation group of degree 4. 

12 Prove that 

(a) S = {(123), (132)} is not a subgroup of S3. 

(b) S = {(12), (3)} is not a permutation group. 

(c) S = {(123) (4), (1)(2)(34)} is not a subgroup of S 4 . 

13 Prove or disprove that 

(a) G = {e_3, (12)(3), (13)(2), (1)(23)} is a subgroup of S_3. 

(b) S = {e_5, (12345), (13524), (14253), (15432)} is a subgroup of S_5. 

(c) S = {e_5, (12345), (13245), (14235), (15234)} is a subgroup of S_5. 

14 Let p ∈ S_m. Define p^1 = p and p^n = p^{n-1} p, n > 1. Describe the infinite 
sequence p^1, p^2, p^3, ... 

(a) if p = e_m. 

(b) if m = 2 and p = (12). 

(c) if m = 3 and p = (123). 

(d) if m = 4 and p = (1234). 

(e) if m = 5 and p = (12345). 

(f) if m = 6 and p = (123456). 

15 Let p ∈ S_m. Prove that {p^n : n ≥ 1} is closed. (See Exercise 14.) 

16 Let f, g ∈ S_m. Suppose fg = e_m. Prove that gf = e_m. In other words, g = f^{-1} if 
and only if either criterion in Theorem 3.1.7 is satisfied. 

17 Write out the Cayley table for the alternating group A_4 = {e_4, (12)(34), 
(13)(24), (14)(23), (123)(4), (124)(3), (132)(4), (134)(2), (142)(3), (143)(2), 
(1)(234), (1)(243)}, thus proving that it is a permutation group. 

18 Find four different permutation groups of degree 5. (Prove that each of them is 
closed.) 

19 Let f ∈ F_{n,n}. If g, h ∈ F_{n,n} are (both) inverses of f, prove that g = h. 

20 Prove that function composition is associative. 

3.2. PERMUTATION GROUPS 

Perfection is achieved, not when there is nothing more to add, but when there is 
nothing left to take away. 

— Antoine de Saint Exupéry 

It is customary to omit cycles of length one when using disjoint cycle notation. 
Instead of p = (1748)(2)(36)(5), for example, one usually writes p = 
(1748)(36). The 1-cycles are still there; they just can't be seen. It's as if they 
were invisible. The convention is that numbers which do not appear are understood 
to be fixed points. 

3.2.1 Example. A Cayley table for S_3 with the 1-cycles suppressed can be found 
in Figure 3.2.1. (Compare with Fig. 3.1.1.) □ 

With the fixed points suppressed, how is one to know whether (12) is a 
permutation in S_2, a permutation in S_3 with an invisible 1-cycle, or, for that matter, a 
permutation in S_8 with six fixed points? Let us agree that whenever the number 
of fixed points is an issue (most of the time), the degree of the permutation (the 
m in S_m) will have to be made clear, one way or another. A second issue arising 
from the new convention leads to another abuse of language. 




        | e_3    (12)   (13)   (23)   (123)  (132) 
 -------+------------------------------------------ 
  e_3   | e_3    (12)   (13)   (23)   (123)  (132) 
  (12)  | (12)   e_3    (132)  (123)  (23)   (13) 
  (13)  | (13)   (123)  e_3    (132)  (12)   (23) 
  (23)  | (23)   (132)  (123)  e_3    (13)   (12) 
  (123) | (123)  (13)   (23)   (12)   (132)  e_3 
  (132) | (132)  (23)   (12)   (13)   e_3    (123) 

Figure 3.2.1. Cayley Table for S_3. 
3.2.2 Definition. A cycle is nontrivial if its length is greater than 1. A 
permutation having just one nontrivial cycle in its disjoint cycle factorization 
will, itself, be referred to as a cycle. A k-cycle in S_m is any permutation of cycle 
type [k, 1^{m-k}]. 

Observe that, apart from e_3, every permutation in S_3 is either a 2-cycle or a 
3-cycle. 

With both the 1-cycles and the little circle representing function composition 
suppressed, how should (123)(45) ∈ S_6 be viewed? Is it a "single" permutation, 
or a composition of f = (123) and g = (45)? In fact, the composition (123) ∘ (45) 
is the permutation with disjoint cycle factorization (123)(45). Omission of 
the 1-cycles and the little circle leads to confusing the composition (123)(45) 
with the permutation (123)(45). Since the two are equal, this confusion is 
harmless. 

Observe that no similar ambiguity arises for (123)(34) ∈ S_4. Because the cycles 
are not disjoint, this one can only be viewed as the composition of f = (123) and 
h = (34). The disjoint cycle factorization of p = f ∘ h is (1234). (Confirm it.) 

Another technical issue is this: In the disjoint cycle factorization of a 
permutation, the order of the "factors" is immaterial, e.g., (123)(45) = (45)(123). 
On the other hand, viewing (123)(45) = (123) ∘ (45) as the composition 
of f = (123) and g = (45) raises an obvious concern. Function composition is 
not commutative! 

Recall that for mathematical statements true means "always true", which leaves 
false meaning not "always true". That's not the same thing as "always false". In 
fact, 

(123) ∘ (45) = (123)(45) 
             = (45)(123) 
             = (45) ∘ (123), 

i.e., permutations f = (123) and g = (45) commute. Indeed, from Lemma 2.4.4, 
the inequivalent cycles of a permutation are disjoint. Because disjoint cycles always 
commute, the "obvious concern" from the previous paragraph turns out to be a 
false alarm. But enough about conventions. Let's get back to the combinatorics 
of permutations. 

Recall that the nonempty closed subsets of S m are members of an aristocracy; 
they are the subgroups. There is nothing inherently difficult about this concept. 
The difficult part is verifying closure, an exercise that seems to require constructing 
an entire Cayley table. One way around this difficulty might be to create closed 
subsets by design. 



This language is more than a little ironic. As we will see, the significance of the 1-cycles is far from 
"trivial". 
3.2.3 Definition. If p ∈ S_m, let p^0 = e_m and p^n = p ∘ p^{n-1}, n ≥ 1. Denoted 
o(p), the order of p is the smallest positive integer k such that p^k = e_m. 

Observe that o(e_m) = 1 for all m. (In particular, order is independent of degree.) 
Before getting to a proof of the existence of o(p), let's see some examples. 

3.2.4 Example. Let p = (123456) ∈ S_m (where m ≥ 6). Then (check the 
computations) 

p^1 = p e_m = p = (123456), 
p^2 = p p^1 = (123456)(123456) = (135)(246), 
p^3 = p p^2 = (123456)(135)(246) = (14)(25)(36), 
p^4 = p p^3 = (123456)(14)(25)(36) = (153)(264), 
p^5 = p p^4 = (123456)(153)(264) = (165432), 
p^6 = p p^5 = (123456)(165432) = e_m, 

so o(p) = 6. (It follows from Lemma 2.4.1 that o(g) = k for any k-cycle g ∈ S_m.) 
Observe that the next few powers of p are 

p^7 = p p^6 = p e_m = p, p^8 = p p^7 = pp = p^2, p^9 = p p^8 = p p^2 = p^3, 

and so on. In particular, p^{12} = p^6 = e_m. 

If f = (12)(3456) ∈ S_7, then f is a permutation of degree 7. To find its order, 
observe that 

f^1 = f = (12)(3456), 
f^2 = (12)(3456)(12)(3456) = (35)(46), 
f^3 = (12)(3456)(35)(46) = (12)(3654), 
f^4 = (12)(3456)(12)(3654) = e_7, 

so o(f) = 4. (Does f^{12} = e_7?) □ 

3.2.5 Lemma. Let n be a positive integer. Suppose p ∈ S_m has order o(p) = k. 
Then p^n = e_m if and only if k is a factor of n. 

Proof. Dividing n by k yields a quotient q and remainder r = n − kq, where 
0 ≤ r < k. Because function composition is associative, p^n = p^{kq+r} = (p^k)^q p^r = 
(e_m)^q p^r = e_m p^r = p^r. In particular, p^n = e_m if and only if p^r = e_m. Because 
r < k = o(p), p^r = e_m if and only if r = 0 if and only if n = kq. ■ 
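
Lemma 3.2.5 is easy to spot-check for a particular permutation. The sketch below is illustrative only: it assumes the sequence form (p(1), ..., p(m)) stored as a Python tuple, and the helpers compose and power are hypothetical names.

```python
def compose(f, g):
    """f o g on sequence forms: entry i-1 holds the image of i."""
    return tuple(f[g[i] - 1] for i in range(len(g)))

def power(p, n):
    """p^n by repeated composition, starting from the identity p^0."""
    result = tuple(range(1, len(p) + 1))
    for _ in range(n):
        result = compose(p, result)
    return result

p = (2, 3, 4, 5, 6, 1)              # the 6-cycle (123456) in sequence form
identity = tuple(range(1, 7))

k = next(n for n in range(1, 100) if power(p, n) == identity)
hits = [n for n in range(1, 25) if power(p, n) == identity]
print(k, hits)                      # 6 [6, 12, 18, 24]
```

The exponents n in the tested range with p^n = e_m are exactly the multiples of o(p) = 6, as the lemma predicts.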



The word order has already caused so much semantic difficulty that it may seem unwise to give it still 
another meaning! 
3.2.6 Theorem. If p ∈ S_m, then o(p) is the least common multiple of the lengths 
of the cycles in the disjoint cycle factorization of p. (In particular, o(p) exists.) 

Proof. If p = e_m, there is nothing to prove. So, suppose p ≠ e_m. Then 

p = C_p(i_1) C_p(i_2) ··· C_p(i_r), 

where C_p(i_t), 1 ≤ t ≤ r, are the nontrivial inequivalent cycles of p. In the aftermath 
of Definition 3.2.2, this means p = p_1 p_2 ··· p_r, where the cycle p_t ∈ S_m differs from 
C_p(i_t) at most by some fixed points. Because inequivalent cycles of p are disjoint, 
and disjoint cycles commute, p^n = p_1^n p_2^n ··· p_r^n. 

Observe that e_m = p^n = p_1^n (p_2^n ··· p_r^n) if and only if 

(p_1^n)^{-1} = p_2^n ··· p_r^n. (3.8) 

If p_1^n ≠ e_m, then p_1^n(i) = j for some j ≠ i. Because any fixed point of p_1 is a fixed 
point of p_1^n, this can happen only if i, j ∈ C_p(i_1), only if both i and j are fixed points 
of p_2, p_3, ..., p_r. So, the left-hand side of Equation (3.8) sends j to i, but the 
right-hand side fixes j. This contradiction proves that p_1^n = e_m. Since any one of the 
cycles could have been first, p^n = e_m if and only if p_t^n = e_m, 1 ≤ t ≤ r. By Lemma 
3.2.5 (and Lemma 2.4.1), p_t^n = e_m if and only if n is a multiple of o(p_t), the length 
of C_p(i_t). Thus, p^n = e_m if and only if n is a common multiple of these lengths, the 
least of which is o(p). ■ 

3.2.7 Example. Let f = (3, 8, 5, 6, 7, 2, 9, 4, 1) ∈ S_9. Apart from establishing 
that o(f) exists, Theorem 3.2.6 illustrates one of the benefits of disjoint cycle 
notation. From the expression f = (13579)(2846), it is easy to see that o(f) = 20. 

What about p = (2, 3, 1, 5, 4)? Can you see that o(p) = 6 without expressing it 
in the form p = (123)(45)? Let's confirm that o(p) = 6. (Check the computations.) 

p^2 = (123)(45)(123)(45) = (132), 
p^3 = (123)(45)(132) = (45), 
p^4 = (123)(45)(45) = (123), 
p^5 = (123)(45)(123) = (132)(45), 
p^6 = (123)(45)(132)(45) = e_5. 

Beyond confirming Theorem 3.2.6, this example illustrates some other facts: 

1. While every fixed point of p is a fixed point of p^n for all n, the converse is 
false. For example, p^2 = (132) fixes 4 and 5, but p = (123)(45) does not. 

2. Apart from knowing to write e_5 in the last step, the degree of p was irrelevant 
to the computation of its order. 

3. Because p ∘ p^5 = e_5, it must be that p^5 = p^{-1}. Similarly, p^4 = (p^2)^{-1} = 
(p^{-1})^2, call it p^{-2}, and p^3 = p^{-3}. □ 
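
Theorem 3.2.6 suggests two ways of computing an order, and comparing them on the permutations of Example 3.2.7 makes a useful sanity check. This is a sketch, not a definitive implementation; it assumes the sequence form of a permutation as a Python tuple, and the names cycle_lengths, order_by_lcm, and order_by_powers are hypothetical.

```python
from functools import reduce
from math import gcd

def cycle_lengths(seq):
    """Lengths of all cycles (including 1-cycles) of the sequence form (p(1), ..., p(m))."""
    seen, lengths = set(), []
    for start in range(1, len(seq) + 1):
        n, i = 0, start
        while i not in seen:
            seen.add(i)
            i = seq[i - 1]
            n += 1
        if n:
            lengths.append(n)
    return lengths

def order_by_lcm(seq):
    """Order as the least common multiple of the cycle lengths (Theorem 3.2.6)."""
    return reduce(lambda a, b: a * b // gcd(a, b), cycle_lengths(seq), 1)

def order_by_powers(seq):
    """Order straight from Definition 3.2.3: the smallest k with p^k equal to the identity."""
    identity = tuple(range(1, len(seq) + 1))
    power, k = tuple(seq), 1
    while power != identity:
        power = tuple(seq[i - 1] for i in power)  # p o power
        k += 1
    return k

f = (3, 8, 5, 6, 7, 2, 9, 4, 1)   # (13579)(2846)
p = (2, 3, 1, 5, 4)               # (123)(45)
print(order_by_lcm(f), order_by_powers(f))   # 20 20
print(order_by_lcm(p), order_by_powers(p))   # 6 6
```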
3.2.8 Theorem. Let p ∈ S_m. If o(p) = k > 1, then p^{-1} = p^{k-1}. 

Proof. By Exercises 16 and 19 of Section 3.1, p^{-1} is a name for the unique 
permutation f ∈ S_m that solves the equation pf = e_m. So, the theorem is a 
consequence of p p^{k-1} = p^k = e_m. ■ 

3.2.9 Definition. Let p ∈ S_m. The cyclic group generated by p is ⟨p⟩ = 
{p^n : 0 ≤ n < o(p)}. 

3.2.10 Example. If o(p) = k, then p^k = e_m, so 

⟨p⟩ = {e_m, p, p^2, ..., p^{k-1}} 
    = {p, p^2, ..., p^{k-1}, p^k}. 

Observe that o(⟨p⟩) = k = o(p); the number of elements in the subgroup ⟨p⟩ is 
equal to the smallest positive integer k such that p^k = e_m. In particular, calling k 
the order of p is no great abuse of language after all. 
As in Example 3.2.4, 

p^{k+1} = p p^k = p e_m = p, 
p^{k+2} = p p^{k+1} = pp = p^2, 
p^{k+3} = p p^{k+2} = p p^2 = p^3, 

and so on. Evidently, the infinite sequence 

p^0, p^1, p^2, ... = e_m, p^1, ..., p^{k-1}, e_m, p^1, ..., p^{k-1}, e_m, p^1, ..., p^{k-1}, e_m, ... 

is cyclic with period k. In particular, 

{p^n : n ≥ 0} = {p^n : 0 ≤ n < k} 
              = {e_m, p, p^2, ..., p^{k-1}} 
              = ⟨p⟩, (3.9) 

which explains why ⟨p⟩ is a cyclic group. □ 

We now justify the word group in Definition 3.2.9. 

3.2.11 Theorem. If p ∈ S_m, then ⟨p⟩ is a subgroup of S_m. 

Proof. Because (associativity and induction) p^r p^s = p^{r+s}, r, s ≥ 0, the nonempty 
subset of S_m on the left-hand side of Equation (3.9) is closed. ■ 



Theorem 3.2.11 gives us the means to construct infinitely many permutation 
groups. Let m be a fixed but arbitrary positive integer and let p be a fixed but 
arbitrary permutation in S_m. Then G = ⟨p⟩ is a permutation group of degree m and 
order o(G) = o(p). 

Is every permutation group cyclic? No. In order for G to be cyclic, it must have a 
generator, i.e., there must be some p ∈ G such that o(p) = o(G). Consider G = S_3, 
for example. From Theorem 3.2.6 and Fig. 3.2.1, the order of an element of S_3 is 1, 
2, or 3. Because S_3 contains no element of order 6, it is not cyclic. On the other 
hand, every p ∈ S_3 is the generator of a cyclic subgroup of S_3, e.g., ⟨(123)⟩ = 
{e_3, (123), (132)}. 

More generally, if p is an arbitrary element of an arbitrary permutation group G 
then, because G is closed, every element of ⟨p⟩ must be an element of G, i.e., 

p ∈ G ⇒ ⟨p⟩ ⊆ G. (3.10) 

From this important observation, we can deduce the following. 

3.2.12 Corollary. Let G be a permutation group of degree m. Then 

1. e_m ∈ G and 

2. p ∈ G ⇒ p^{-1} ∈ G. 

Proof. Because G cannot be empty, it contains a permutation that may as well be 
denoted p. Suppose o(p) = k. If k = 1, then p^{-1} = e_m = p ∈ G. Otherwise, 
by Implication (3.10), ⟨p⟩ = {e_m, p, ..., p^{k-1}} ⊆ G. Thus, e_m ∈ G and, by 
Theorem 3.2.8, p^{-1} = p^{k-1} ∈ G. ■ 

Let's summarize. Suppose G is a permutation group of degree m. Then, by 
definition, G is nonempty and closed with respect to the associative operation of 
function composition. By Corollary 3.2.12, G contains the identity permutation 
and the inverse of each of its elements. In addition, the cyclic subgroup idea 
provides lots of examples. Another way to obtain examples comes from the following 
construction. 

3.2.13 Definition. Let G be a permutation group of degree m. The stabilizer 
subgroup of x ∈ {1, 2, ..., m} is the subset of G consisting of those permutations 
that fix x, i.e., 

G_x = {p ∈ G : p(x) = x}. (3.11) 

By Corollary 3.2.12, e_m ∈ G. Because e_m(x) = x, e_m ∈ G_x. So G_x is not empty. If 
f, g ∈ G_x, then fg(x) = f(g(x)) = f(x) = x, so fg ∈ G_x. Therefore, G_x is closed and 
so, as its name implies, G_x is a subgroup. 



'These are the axioms for an abstract group. 
3.2.14 Example. Let G = S_4. If x = 4, then, because we have decided to make 
the fixed points invisible, G_x looks like S_3. Because we made a mental note not to 
forget the fixed points, G_x ≠ S_3. □ 

3.2.15 Example. Let G = ⟨f⟩, where f = (12)(3456) ∈ S_7. From Example 
3.2.4, G = {e_7, (12)(3456), (35)(46), (12)(3654)}. If x = 1 or x = 2, then 
G_x = {e_7, (35)(46)} = ⟨f^2⟩. If 3 ≤ x ≤ 6, then G_x = {e_7} = ⟨e_7⟩. If x = 7, then 
G_x = G. 

Suppose G = ⟨p⟩, where p = (123)(45) ∈ S_5. Then, from Example 3.2.7, G = 
{e_5, (123)(45), (132), (45), (123), (132)(45)}. In this case, G_1 = G_2 = G_3 
= {e_5, (45)} = ⟨p^3⟩ and G_4 = G_5 = {e_5, (132), (123)}. (Observe that ⟨p^2⟩ = 
⟨(123)⟩ = ⟨p^4⟩.) □ 

3.2.16 Definition. If G and H are subgroups of S_m and if H is a subset of G, then 
H is a subgroup of G. 

We have found two ways to create groups by design, namely, the cyclic 
subgroups ⟨p⟩, where p is a permutation, and the stabilizer subgroups G_x, where 
G is an existing group and x is a number. While stabilizer subgroups can be cyclic 
(see Example 3.2.15), they can also be noncyclic (see Example 3.2.14). 

The discussion of stabilizer subgroups has opened the door to some other 
possibilities. If G is a permutation group of degree m, consider the set 
S = {g ∈ G : g(x) = y}, where y ≠ x. Because e_m(x) = x ≠ y, S is precluded 
from being a subgroup by part 1 of Corollary 3.2.12. 

That's interesting. The subset G_x consisting of the permutations that map x to x is 
a subgroup of G, but the subset S consisting of the permutations that map x to y ≠ x 
is not. Nevertheless, at least when it isn't empty, S is a close relative of G_x. If f ∈ S 
and p ∈ G_x, then fp(x) = f(p(x)) = f(x) = y, i.e., g = fp ∈ S. In other words, for 
any f ∈ S, fp ∈ S for every p ∈ G_x. 

3.2.17 Definition. Let G be a permutation group of degree m and suppose 
f ∈ G. If H is a subgroup of G, then the subset 

fH = {fp : p ∈ H} ⊆ G (3.12) 

is a (left) coset of H. 

3.2.18 Theorem. Let G be a permutation group of degree m. Suppose f ∈ G. If 
f(x) = y, then the subset of G consisting of all the permutations that send x to y is 
the coset fG_x, i.e., 

fG_x = {g ∈ G : g(x) = y}. (3.13) 

Proof. By the discussion preceding Definition 3.2.17, fG_x ⊆ S = {g ∈ G : g(x) = 
y}. To prove the converse, suppose g ∈ G. If g(x) = y = f(x), then f^{-1}g(x) = 
f^{-1}(g(x)) = f^{-1}(y) = x. Therefore, f^{-1}g = p for some p ∈ G_x, i.e., g ∈ fG_x. ■ 


3.2.19 Example. Suppose G = ⟨(123)(45)⟩ = {e_5, (123)(45), (132), (45), (123), 
(132)(45)}. Then 

G_1 = {g ∈ G : g(1) = 1} 
    = {e_5, (45)}. 

If f = (123)(45), then f(1) = 2. Observe that 

fG_1 = (123)(45){e_5, (45)} 
     = {(123)(45), (123)} 
     = {g ∈ G : g(1) = 2}, 

confirming Theorem 3.2.18. If h = (123), then h(1) = 2. Although h ≠ f, hG_1 = 
(123){e_5, (45)} = {(123), (123)(45)} = fG_1. 

Similarly, G_5 = {e_5, (132), (123)}. If f = (45), then f maps x = 5 to y = 4. 
Because disjoint cycles commute, 

fG_5 = {(45), (132)(45), (123)(45)} 
     = {g ∈ G : g(5) = 4}. □ 
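
The stabilizer and coset computations of Example 3.2.19 can be reproduced mechanically. The following sketch is an illustration under stated assumptions: permutations are stored in sequence form as tuples, and cyclic_group, stabilizer, and left_coset are hypothetical helper names mirroring Definitions 3.2.9, 3.2.13, and 3.2.17.

```python
def compose(f, g):
    """f o g on sequence forms: entry i-1 holds the image of i."""
    return tuple(f[g[i] - 1] for i in range(len(g)))

def cyclic_group(p):
    """All powers of p, starting with the identity p^0."""
    identity = tuple(range(1, len(p) + 1))
    elems, power = [identity], p
    while power != identity:
        elems.append(power)
        power = compose(p, power)
    return elems

def stabilizer(G, x):
    """G_x: the elements of G that fix x."""
    return {g for g in G if g[x - 1] == x}

def left_coset(f, H):
    """fH = {fp : p in H}."""
    return {compose(f, h) for h in H}

p = (2, 3, 1, 5, 4)          # (123)(45) in sequence form
G = cyclic_group(p)
G1 = stabilizer(G, 1)        # {e_5, (45)}
f = p                        # f(1) = 2
h = (2, 3, 1, 4, 5)          # (123), which also sends 1 to 2

sends_1_to_2 = {g for g in G if g[0] == 2}
print(left_coset(f, G1) == sends_1_to_2)        # True
print(left_coset(h, G1) == left_coset(f, G1))   # True
```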

3.2.20 Example. Let G = {e_4, (12), (34), (12)(34)}. (Confirm that G is closed 
but that it is not cyclic.) The stabilizer subgroup G_1 = {e_4, (34)} = ⟨(34)⟩. If 
f = (12), then f(1) = 2. By Theorem 3.2.18, the subset of G consisting of all 
permutations that map 1 to 2 is 

fG_1 = (12){e_4, (34)} 
     = {(12), (12)(34)}. 

Indeed, the complement of fG_1 in G is G_1, no element of which maps 1 to 2. □ 

Suppose G is a permutation group of degree m. Let x, y ∈ {1, 2, ..., m}. If a total 
of k permutations of G fix x, how many map x to y? Suppose, e.g., that G is the 
group from Example 3.2.20. If x = 1, then a total of k = o(G_1) = 2 permutations 
of G fix x. If y = 2, then o(fG_1) = 2 as well. If y = 3, however, no permutation of 
G maps x to y. 

Okay, let's rephrase the question. Suppose a total of k permutations of G fix x. If 
f(x) = y ≠ x for some f ∈ G, how many permutations of G map x to y? Because 
{g ∈ G : g(x) = y} = fG_x = {fp : p ∈ G_x}, it seems that o(fG_x) = o(G_x), unless 
fp = fq for some p, q ∈ G_x, where p ≠ q. But, if fp = fq, then p = e_m p = 
(f^{-1}f)p = f^{-1}(fp) = f^{-1}(fq) = (f^{-1}f)q = e_m q = q, i.e., p = q. Thus, 

o(fG_x) = o(G_x). (3.14) 
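
Equation (3.14) says that every value y actually reached from x is reached by the same number of permutations, namely o(G_x). A small tally for the group of Example 3.2.20 illustrates this; the encoding of G as sequence-form tuples is an assumption made only for this sketch.

```python
from collections import Counter

# The group of Example 3.2.20 in sequence form (entry i-1 is the image of i).
G = [
    (1, 2, 3, 4),  # e_4
    (2, 1, 3, 4),  # (12)
    (1, 2, 4, 3),  # (34)
    (2, 1, 4, 3),  # (12)(34)
]

x = 1
counts = Counter(g[x - 1] for g in G)   # y -> number of g in G with g(x) = y
stabilizer_size = counts[x]             # o(G_x)

print(dict(counts))                     # {1: 2, 2: 2}
print(all(c == stabilizer_size for c in counts.values()))  # True
```

No permutation of G maps 1 to 3 or to 4, so those values do not appear in the tally at all.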
3.2. EXERCISES 

1 Compute o(p) if 

(a) p = (123)(4567)(89). (b) p = (123)(45678). 

(c) p = (12)(34)(56)(78). (d) p = (123)(456)(789). 

2 Let G = S_4. 

(a) With the 1-cycles omitted, exhibit the disjoint cycle factorizations of all 24 
permutations in G. 

(b) How many of the permutations of G are cycles? 

(c) How many of the permutations of G are derangements? 

3 Exhibit the disjoint cycle factorization of p^n, 1 ≤ n ≤ 10, when 

(a) p = (1234). (b) p = (12345). 

(c) p = (123456). (d) p = (12345678). 

4 Let p = (147926853) ∈ S_9. Without computing p^5, explain how you can tell that 
p^5(1) = 6. 

5 Show that the permutation group 

(a) G = {e_4, (12)(34), (12), (34)} is not cyclic. 

(b) G = {e_4, (12)(34), (1324), (1423)} is cyclic. 

(c) G = A_4, from Exercise 17, Section 3.1, is not cyclic. 

6 Show that the cyclic group G = {e_4, (12)(34), (1324), (1423)} from 
Exercise 5(b) has two generators, i.e., find p, q ∈ G such that p ≠ q but 
⟨p⟩ = ⟨q⟩ = G. 

7 Find all the generators of G = ⟨p⟩ when 

(a) p = (1234). (b) p = (12345). (c) p = (14325). 

(d) p = (123456). (e) p = (1234567). (f) p = (12345678). 

8 Let G = {e_4, (12), (34), (12)(34)}. Exhibit G_x when 

(a) x = 1. (b) x = 2. (c) x = 3. 

9 Let G = {e_4, (1234), (1432), (13), (24), (12)(34), (13)(24), (14)(23)}. 

(a) Show that G is a subgroup of S_4. 

(b) Exhibit G_x when x = 3. 

(c) Exhibit G_4. 

(d) Find p, q ∈ G, p ≠ q, such that p(3) = 2 = q(3). 

(e) Show that pG_3 = qG_3, where p and q are the permutations you found in 
part (d). 

(f) How many different cyclic subgroups does G have? 
(f) How many different cyclic subgroups does G have? 
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10 Let G be a permutation group and suppose / G G. Prove that fG x = G x if and 
only if f(x) = x. 

11 If n is a positive integer, prove that (p"Y l = for all p G 5 m . 

12 Let // be a subgroup of a permutation group G. If /, g G G, define / ~ g to 
mean that g -1 / G //. 

(a) Prove that ~ is an equivalence relation on G. 

(b) If g G G, prove that the equivalence class to which g belongs is the coset 
gH = {gh : h G //}. 

(c) Prove that o(gtf) = o(H) for all g G G. 

(d) Prove that o(G) = o(H)r, where r is the number of different equivalence 
classes. 

(e) Prove Lagrange's theorem: o(H) exactly divides o(G). 

13 Prove that o(p~ l ) — o(p) for all p G S m . 

14 Prove that (p^ 1 } = (p) for all p e S m . 

15 Prove or disprove that (p 2 ) = (p) for all p G 5 m . 

16 Another name for a 2-cycle is a transposition. So, a transposition in S m is a 
permutation of cycle type [2, l'"~ 2 ],m > 2. 

(a) Express /? = (123) G S3 as a product (composition) of two transpositions. 

(b) Express p = (1234) G S4 as a product of three transpositions. 

(c) Express p — (123)(4567) G S m (m > 7) as a product of five transpositions. 

(d) Show that every permutation p G 5 m is a product of m- c(p) transposi- 
tions, where c(/?) is the total number of cycles, including cycles of length 
1, in the disjoint cycle factorization of p. 

17 Express (12345) as a product of four transpositions in two different ways. (See 
Exercise 16 for the definition of transposition.) 

18 Prove or disprove that every permutation in $S_m$, $m \ge 3$, is a product
(composition) of 3-cycles. (Hints: The 3-cycles need not be disjoint; "3-cycles" is not
the same as "three cycles".)

19 Suppose $p \in G$, where G is a permutation group of degree m. If p(x) = y, show
that $G_y p = pG_x$.

20 A permutation $p \in S_m$ is self-inverse if $p^{-1} = p$.

(a) Describe, in words, how to identify self-inverse permutations from the
Cayley table for $S_m$.

(b) Describe the possible cycle types for self-inverse permutations.

21 A permutation $p \in S_m$ is idempotent if $p^2 = p$. Describe the possible cycle
types for the idempotent permutations.

22 Consider the m-cycle $p = (12 \ldots m) \in S_m$. Suppose r is a fixed positive
integer, 1 < r < m. Show that $p^r$ is a product (composition) of d disjoint
cycles each of length m/d, where d is the greatest common divisor of m and r.

23 Let $p \in S_m$, where $m \ge 2$. It can be proved that if p can be written one way as a
product of k transpositions and some other way as a product of r transpositions,
then $(-1)^k = (-1)^r$. (See Exercise 16 for the definition of "transposition".) In
other words, every $p \in S_m$ is either odd or even depending on whether it can be
written as the product of an odd or an even number of transpositions. Let
$A_m = \{p \in S_m : p \text{ is even}\}$.

(a) Prove that $A_m$ is a subgroup of $S_m$. (It is called the alternating group of
degree m.)

(b) Prove that the set of odd permutations, $S_m \setminus A_m$, is not a subgroup.

(c) Prove that $S_m \setminus A_m = (12)A_m$ is a coset of $A_m$.

(d) Prove that $o(A_m) = \frac{1}{2}\, m!$.

(e) Confirm that $A_4$ is the group in Exercise 17, Section 3.1.

24 Describe the cycle types of the permutations $p \in S_m$ that satisfy

(a) $p^{-1} = p^2 \ne p$.

(b) $p^{-1} = p^3 \ne p$.

25 Let $p \in S_m$. Prove that the cyclic group $\langle p \rangle$ is the intersection of all
subgroups of $S_m$ that contain p.



3.3. BURNSIDE'S LEMMA 

When I am working on a problem I never think about beauty. But when I have fin- 
ished, if the solution is not beautiful, I know it is wrong. 

— Buckminster Fuller 

Getting from point a to point b can sometimes be a problem. Consider the case in
which $a, b \in V = \{1, 2, \ldots, m\}$. Let G be a subgroup of $S_m$, and suppose the only
way to get from a to b is via some permutation $p \in G$ that maps a to b. If G were a
transportation system, the ideal situation would be one in which, for any
$a, b \in V$, there is a $p \in G$ such that p(a) = b. But few real-life systems are ideal.
Take the San Francisco Bay Area, for example, where public transportation is
relatively good. If a and b are both in Oakland, an AC-Transit bus will take
passengers from point a to point b. If a and b are in San Francisco, MUNI will do the
job. Getting from point a in Oakland to point b in San Francisco, however, is another
matter. If the system were enlarged to include BART,* there would be no problem.
But anyone restricted to AC-Transit or MUNI would be out of luck.

* The Bay Area Rapid Transit district.

3.3.1 Definition. If G is a permutation group of degree m, then
$x, y \in V = \{1, 2, \ldots, m\}$ are equivalent modulo G, written

$$x \equiv y \pmod{G}, \tag{3.15}$$

if there is a permutation $p \in G$ such that p(x) = y.

For the case modeled by Bay Area buses, any two points in Oakland are equivalent,
as are any two points in "the City". Without BART, however, no point of
Oakland is equivalent to any point in San Francisco. The two cities are in different
transit districts or equivalence classes, language that depends on the next result.

3.3.2 Theorem. If G is a permutation group of degree m, then equivalence
modulo G is an equivalence relation.

To prove the theorem, it will be necessary to verify the following: For all
$x, y, z \in V = \{1, 2, \ldots, m\}$,

1. $x \equiv x \pmod{G}$.

2. $x \equiv y \pmod{G} \Rightarrow y \equiv x \pmod{G}$.

3. $x \equiv y \pmod{G}$ and $y \equiv z \pmod{G} \Rightarrow x \equiv z \pmod{G}$.

Proof of Theorem 3.3.2. By Corollary 3.2.12, $e_m \in G$. Because $e_m(x) = x$,
$1 \le x \le m$, criterion 1 is verified.

If $x \equiv y \pmod{G}$, there is a permutation $p \in G$ such that p(x) = y. By Corollary
3.2.12, $p^{-1} \in G$. Because p(x) = y if and only if $p^{-1}(y) = x$, criterion 2 is proved.

If $x \equiv y \pmod{G}$ and $y \equiv z \pmod{G}$, there are permutations $f, g \in G$ such that
f(x) = y and g(y) = z. Because G is closed, $p = gf \in G$. Since
p(x) = gf(x) = g(f(x)) = g(y) = z, criterion 3 is established. ■

Equivalence classes arising from the action of a permutation group are of fundamental
importance in combinatorial enumeration.

3.3.3 Definition. Let G be a permutation group of degree m. Equivalence
classes modulo G are called orbits of G. The orbit of G containing x is

$$O_x = \{p(x) : p \in G\}. \tag{3.16}$$

In this definition, x and p(x) are numbers. In particular, the orbits of G are subsets,
not of G, but of $V = \{1, 2, \ldots, m\}$. From the general theory of equivalence
relations, if $O_x$ and $O_y$ overlap at all, they are identical, i.e., the different orbits
of G comprise a partition of V. In the bus metaphor, the orbit of a point in San
Francisco is the entire city, and the San Francisco orbit is disjoint from the Oakland
orbit.

3.3.4 Example. If $G = \{e_4, (12), (34), (12)(34)\}$, then
$O_1 = \{p(1) : p \in G\} = \{1, 2, 1, 2\}$, multiplicities included. Eliminating
repetitions, $O_1 = \{1, 2\}$. Because $2 \in O_1$, it follows from the general theory that
$O_2 = O_1$. This can, of course, be confirmed directly:
$O_2 = \{p(2) : p \in G\} = \{2, 1, 2, 1\}$, multiplicities included.
Similarly $O_3 = \{3, 4\} = O_4$. (Check it.)

It is important to distinguish the subset {3,4} from the cycle (34), and the orbit
$O_1 = \{1, 2\}$ from the stabilizer subgroup $G_1 = \{e_4, (34)\}$. Whereas the orbit
$O_x \subseteq \{1, 2, \ldots, m\}$, the stabilizer subgroup $G_x \subseteq G \subseteq S_m$. In particular,
$O_x$ is a set of numbers and $G_x$ is a set of permutations.

Equivalence modulo $H = \{e_4, (12)(34), (13)(24), (14)(23)\}$ is trivial. There is
only one orbit, namely, $O_1 = O_2 = O_3 = O_4 = \{1, 2, 3, 4\}$. (Check it.) Ironically,
what is trivial for permutation groups is ideal for transportation systems.
Equivalence modulo H is trivial because, for all $a, b \in \{1, 2, 3, 4\}$, there is a
permutation in H that maps a to b. As we will soon see, however, permutation groups
affording trivial equivalence relations are, themselves, anything but trivial. □
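The orbit computations in Example 3.3.4 can be checked mechanically. The following Python sketch is an illustration only, not part of the text's development. It assumes that each permutation is stored as a tuple in sequence notation, so that `p[x - 1]` plays the role of p(x); the function name `orbits` is a hypothetical choice made here.

```python
def orbits(group, m):
    """Return the orbits of a permutation group on V = {1, ..., m}.

    Assumption: `group` is a list of tuples in sequence notation,
    i.e. p[x - 1] is the image p(x), and it really is a group
    (in particular it contains the identity).
    """
    remaining = set(range(1, m + 1))
    found = []
    while remaining:
        x = min(remaining)
        # O_x = {p(x) : p in G}; the set discards multiplicities.
        orbit = {p[x - 1] for p in group}
        found.append(sorted(orbit))
        remaining -= orbit
    return found


# G = {e_4, (12), (34), (12)(34)} from Example 3.3.4
G = [(1, 2, 3, 4), (2, 1, 3, 4), (1, 2, 4, 3), (2, 1, 4, 3)]
# H = {e_4, (12)(34), (13)(24), (14)(23)}
H = [(1, 2, 3, 4), (2, 1, 4, 3), (3, 4, 1, 2), (4, 3, 2, 1)]

print(orbits(G, 4))  # [[1, 2], [3, 4]]
print(orbits(H, 4))  # [[1, 2, 3, 4]]
```

Because distinct orbits are disjoint, it is enough to compute the orbit of the smallest number not yet accounted for and then discard that whole orbit.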

3.3.5 Definition. Let G be a permutation group of degree m. Then G is transitive
if it has only one orbit, i.e., if for every choice of x and y in $V = \{1, 2, \ldots, m\}$ there
exists a permutation $p \in G$ such that p(x) = y.

3.3.6 Example. While the group

$$H = \{e_4, (12)(34), (13)(24), (14)(23)\},$$

from Example 3.3.4, is transitive, the group

$$K = \{e_5, (12)(34), (13)(24), (14)(23)\}$$

is not. The difference, of course, is a matter of degree. Being of degree 4, the single
orbit of H is $O_1 = O_2 = O_3 = O_4 = \{1, 2, 3, 4\}$. Because it is of degree 5, the
orbits of K are $O_1 = O_2 = O_3 = O_4 = \{1, 2, 3, 4\}$ and $O_5 = \{5\}$.

Perhaps the easiest way to see that $S_m$ is transitive is via sequence notation.
Suppose $i, j \in V = \{1, 2, \ldots, m\}$. If $f = (f(1), f(2), \ldots, f(m)) \in F_{m,m}$, then f(i)
is the number in the ith component of the sequence. With j occupying that position,
there are (m − 1)! permutations $f \in S_m$ that map i to j. □



3.3.7 Lemma. Let G be a permutation group of degree m. If $x \in \{1, 2, \ldots, m\}$,
then the number of elements in the orbit to which x belongs is

$$o(O_x) = \frac{o(G)}{o(G_x)}. \tag{3.17}$$

Proof. The set $O_x = \{p(x) : p \in G\}$ appears to contain o(G) elements but, as we
saw in Example 3.3.4, this includes the multiplicities that arise when
$p_1(x) = y = p_2(x)$ for two different permutations $p_1, p_2 \in G$. However, from
Theorem 3.2.18, if f(x) = y, then $\{p \in G : p(x) = y\} = fG_x$. Hence, as p runs through G,
y occurs as the value of p(x) exactly $o(fG_x)$ times. Moreover, by Equation (3.14),
the multiplicity $o(fG_x) = o(G_x)$ is the same for every $y \in O_x$. The o(G) values p(x)
therefore consist of $o(O_x)$ different numbers, each occurring $o(G_x)$ times, so
$o(G) = o(O_x)\,o(G_x)$. ■
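The counting argument in this proof is easy to watch in action. The sketch below is an illustrative check under the same assumption as before (permutations stored as tuples in sequence notation); the helper names are hypothetical. It tallies how often each value p(x) occurs as p runs through the group and compares that multiplicity with the order of the stabilizer.

```python
from collections import Counter


def orbit_with_multiplicities(group, x):
    """Tally the values p(x) as p runs through the group."""
    return Counter(p[x - 1] for p in group)


def stabilizer(group, x):
    """G_x: the permutations in the group that fix x."""
    return [p for p in group if p[x - 1] == x]


# G from Example 3.3.4 and K from Example 3.3.6, in sequence notation.
G = [(1, 2, 3, 4), (2, 1, 3, 4), (1, 2, 4, 3), (2, 1, 4, 3)]
K = [(1, 2, 3, 4, 5), (2, 1, 4, 3, 5), (3, 4, 1, 2, 5), (4, 3, 2, 1, 5)]

for name, group, m in (("G", G, 4), ("K", K, 5)):
    for x in range(1, m + 1):
        counts = orbit_with_multiplicities(group, x)
        stab = stabilizer(group, x)
        # Every y in O_x occurs exactly o(G_x) times ...
        assert set(counts.values()) == {len(stab)}
        # ... so o(O_x) * o(G_x) = o(G), which is Equation (3.17).
        assert len(counts) * len(stab) == len(group)
        print(name, x, sorted(counts), len(stab))
```

For K, the point x = 5 has an orbit of size 1 and a stabilizer of order 4, while every other point has an orbit of size 4 and a trivial stabilizer.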
Having counted the elements in each orbit, how hard can it be to count the number
of orbits? If every orbit had the same size, counting them would be as easy as
dividing m by $o(O_x)$ for some fixed but arbitrary $x \in \{1, 2, \ldots, m\}$. However, orbits
need not have the same size. (See, e.g., Example 3.3.6, where the orbits of K are
$O_1 = \{1, 2, 3, 4\}$ and $O_5 = \{5\}$.)

There is, in fact, a beautiful way to calculate the number of orbits of a permutation
group, a method that is as powerful as it is unexpected. The significance of this
result may justify a brief anecdote about its history.

William Burnside (1852-1927) published the lemma in his 1897 book on finite
groups, along with a footnote citing an 1887 article by Georg Frobenius (1849-1917)
as its source. When the footnote was inadvertently omitted from the book's
second edition, the result came to be known as "Burnside's lemma".* In fact, the
same idea had appeared even earlier in an 1847 article by Cauchy (1789-1857).
Before we can state this famous result, one more bit of notation is needed.

* For more details, see Peter M. Neumann, A lemma that is not Burnside's, Math. Scientist 4 (1979),
133-141.

3.3.8 Definition. Denote by F(p) the number of fixed points of $p \in S_m$.

3.3.9 Burnside's Lemma. Let G be a permutation group with a total of t orbits.
Then t is the average of the numbers of fixed points of the permutations in G. That
is,

$$t = \frac{1}{o(G)} \sum_{g \in G} F(g). \tag{3.18}$$

3.3.10 Example. For the group $H = \{e_4, (12)(34), (13)(24), (14)(23)\}$, from
Example 3.3.6, $F(e_4) = 4$, and F((12)(34)) = F((13)(24)) = F((14)(23)) = 0.
Because the average of these four numbers is 1, H has just one orbit, confirming
that it is transitive.

If $K = \{e_5, (12)(34), (13)(24), (14)(23)\}$, then $F(e_5) = 5$, and F((12)(34)) =
F((13)(24)) = F((14)(23)) = 1. (This would be a natural time to have misgivings
about suppressing 1-cycles.) The average of these numbers of fixed points is
(5 + 1 + 1 + 1)/4 = 2, consistent with our observation in Example 3.3.6 that K
partitions {1, 2, 3, 4, 5} into two orbits. □

3.3.11 Example. Because $S_m$ is transitive, it has just one orbit. It follows from
Burnside's lemma that, on average, the permutations of $S_m$ have one fixed point.
(Recall from Section 2.3 that the fraction of permutations in $S_m$ having exactly
one fixed point is something else entirely.)

In $S_3$, $F(e_3) = 3$, F(12) = F(13) = F(23) = 1, and F(123) = F(132) = 0. So
(as predicted),

$$[3 + 1 + 1 + 1 + 0 + 0]/6 = 1.$$

□ 
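The averages in Examples 3.3.10 and 3.3.11 can be reproduced with a few lines of Python. This is a sketch under stated assumptions rather than anything from the text: permutations are tuples in sequence notation, `fixed_points` and `burnside_orbit_count` are hypothetical helper names, and exact rational arithmetic is used so that the average comes out as an exact number.

```python
from fractions import Fraction
from itertools import permutations


def fixed_points(p):
    """F(p): the number of x with p(x) = x (p in sequence notation)."""
    return sum(1 for x, image in enumerate(p, start=1) if x == image)


def burnside_orbit_count(group):
    """Average number of fixed points, the right-hand side of (3.18)."""
    return Fraction(sum(fixed_points(p) for p in group), len(group))


H = [(1, 2, 3, 4), (2, 1, 4, 3), (3, 4, 1, 2), (4, 3, 2, 1)]
K = [(1, 2, 3, 4, 5), (2, 1, 4, 3, 5), (3, 4, 1, 2, 5), (4, 3, 2, 1, 5)]
S3 = list(permutations(range(1, 4)))  # all of S_3

print(burnside_orbit_count(H))   # 1
print(burnside_orbit_count(K))   # 2
print(burnside_orbit_count(S3))  # 1
```

Note that the degree matters only through the tuples themselves: the same four cycle products give one orbit when written in degree 4 and two orbits when written in degree 5.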
Proof of Burnside's Lemma. Define $S = \{(g, j) : g \in G \text{ and } g(j) = j\}$. Then S is
the set of all ordered pairs (g, j) in which j is a fixed point of g. Because F(g) of
these ordered pairs begin with g,

$$o(S) = \sum_{g \in G} F(g). \tag{3.19}$$

On the other hand, exactly $o(G_j)$ permutations of G fix j. Therefore,

$$o(S) = \sum_{j=1}^{m} o(G_j) = o(G) \sum_{j=1}^{m} \frac{1}{o(O_j)} \tag{3.20}$$

by a rearrangement of Equation (3.17).

Let $C_1, C_2, \ldots, C_t$ be the distinct orbits of G, so that $O_j \in \{C_1, C_2, \ldots, C_t\}$,
$1 \le j \le m$. Then, continuing from Equation (3.20),

$$o(S) = o(G) \sum_{i=1}^{t} \sum_{j \in C_i} \frac{1}{o(C_i)}.$$

Note that, in the second of these summations, $1/o(C_i)$ is added to itself $o(C_i)$ times,
i.e.,

$$o(S) = o(G) \sum_{i=1}^{t} 1 = t\,o(G). \tag{3.21}$$

Comparing Equations (3.19) and (3.21) completes the proof. ■

3.3.12 Corollary. If G is a subgroup of $S_m$, then

$$\frac{1}{o(G)} \sum_{g \in G} F(g) \ge 1, \tag{3.22}$$

with equality if and only if G is transitive.

Proof. Because t = 1 if and only if G is transitive, the result is an immediate
consequence of Equation (3.18). ■

3.3.13 Example. From Example 3.3.4, the orbits of $G = \{e_4, (12), (34), (12)(34)\}$
are {1,2} and {3,4}. Averaging the fixed points of the permutations
in G yields $\frac{1}{4}(4 + 2 + 2 + 0) = 2 > 1$, confirming that G is not transitive. □

A subgroup G of $S_m$ is doubly transitive if, for all $x_1, x_2, y_1, y_2 \in \{1, 2, \ldots, m\}$,
where $x_1 \ne x_2$ and $y_1 \ne y_2$, there is a permutation $p \in G$ such that $p(x_1) = y_1$ and
$p(x_2) = y_2$.

This definition looks complicated, in part, because of technical considerations: If
$x_1 \ne x_2$ but $y_1 = y_2$, then no one-to-one function could send $x_1$ to $y_1$ and $x_2$ to $y_2$; if
$x_1 = x_2$ but $y_1 \ne y_2$, then no function could send $x_1$ to $y_1$ and $x_2$ to $y_2$. Informally, G
is doubly transitive if, for all appropriate sequences $x = (x_1, x_2)$ and $y = (y_1, y_2)$,
there is a permutation $p \in G$ that maps x to y.

Would it surprise you to learn that, if $m \ge 2$, then

$$\frac{1}{o(G)} \sum_{g \in G} F(g)^2 \ge 2, \tag{3.23}$$

with equality if and only if G is doubly transitive? It is hard to look at Inequalities
(3.22)-(3.23) and not conjecture that, if $m \ge 3$, then the average over $g \in G$ of
$F(g)^3$ is not less than 3 with equality if and only if G is triply transitive.

Let's test this hypothesis. The numbers of fixed points of the permutations in $S_3$
are listed in Example 3.3.11. The average of their third powers is
$\frac{1}{6}(3^3 + 1^3 + 1^3 + 1^3 + 0^3 + 0^3) = \frac{30}{6} = 5$. Five? What happened to 3?
Maybe we glided too nimbly over the details of what "triply transitive" might mean.
If $S_3$ turns out not to be triply transitive, there is still hope for the conjecture.
On the other hand, maybe the correct lower bound is not 3 but 5. (After all,
1, 2, 3, . . . is not the only sequence of positive integers.) Before doing anything
else, let's give a proper definition of multiple transitivity.

3.3.14 Definition. Let G be a subgroup of $S_m$. Suppose $1 \le r \le m$. Then G is
r-fold transitive if, for all one-to-one functions $f, g \in F_{r,m}$, there exists a permutation
$p \in G$ such that pf = g.

Using one-to-one functions enormously simplifies the statement of Definition
3.3.14. To see what it means, recall that $f = (x_1, x_2, \ldots, x_r) \in F_{r,m}$ is one-to-one
if and only if the x's are all different. Thus, G is r-fold transitive if and only if,
for each of the $P(m, r)^2$ ways to choose one-to-one functions $f = (x_1, x_2, \ldots, x_r)$
and $g = (y_1, y_2, \ldots, y_r)$ from $F_{r,m}$, there is a permutation $p \in G$ such that

$$p(x_i) = p(f(i)) = pf(i) = g(i) = y_i, \quad 1 \le i \le r.$$

In other words, G is r-fold transitive if and only if, for any of the $P(m, r)^2$ ways to
choose (without replacement, where order matters) sequences of distinct integers
$(x_1, x_2, \ldots, x_r)$ and $(y_1, y_2, \ldots, y_r)$ from {1, 2, . . . , m}, there exists a $p \in G$ such
that, simultaneously, $p(x_1) = y_1$, $p(x_2) = y_2$, . . . , and $p(x_r) = y_r$.

Evidently, "transitive" is the same as "1-fold transitive" and "doubly transitive"
is the same as "2-fold transitive". Moreover, every (r + 1)-fold transitive
group is r-fold transitive.

3.3.15 Example. Recall that $H = \{e_4, (12)(34), (13)(24), (14)(23)\}$ is transitive.
Suppose $(x_1, x_2) = (1, 2)$ and $(y_1, y_2) = (2, 3)$. The only permutation in H
that maps $x_1 = 1$ to $y_1 = 2$ is p = (12)(34). Because $p(2) \ne 3$, no permutation in
H simultaneously sends $x_1$ to $y_1$ and $x_2$ to $y_2$, i.e., H is not doubly transitive.

What about $S_4$? Any function in $F_{4,4}$ of the form (2, 3, r, s) maps $x_1 = 1$ to
$y_1 = 2$ and $x_2 = 2$ to $y_2 = 3$. Two of these functions are permutations, namely,
$p_1 = (2, 3, 1, 4)$ and $p_2 = (2, 3, 4, 1)$. (In disjoint cycle notation, $p_1 = (123)$ and
$p_2 = (1234)$.) More generally, if $f, g \in F_{r,m}$ are fixed but arbitrary one-to-one
functions, then (m − r)! permutations $p \in S_m$ satisfy pf = g. In particular, $S_m$ is r-fold
transitive, $1 \le r \le m$. (Compare with the last part of Example 3.3.6.) □
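Definition 3.3.14 lends itself to a brute-force test for small groups. The sketch below is illustrative only; it assumes permutations are tuples in sequence notation, represents a one-to-one function in $F_{r,m}$ as a tuple of r distinct integers, and uses the hypothetical name `is_r_fold_transitive`. For each one-to-one f it collects the set of compositions pf as p runs through the group and checks that every one-to-one g is reached.

```python
from itertools import permutations


def is_r_fold_transitive(group, m, r):
    """Brute-force check of Definition 3.3.14 (small groups only).

    `group` is a list of permutations of {1, ..., m} in sequence
    notation, so p[x - 1] is p(x).
    """
    # All P(m, r) one-to-one functions f = (x_1, ..., x_r).
    injections = list(permutations(range(1, m + 1), r))
    for f in injections:
        # {pf : p in G}, where pf = (p(x_1), ..., p(x_r)).
        reachable = {tuple(p[x - 1] for x in f) for p in group}
        if len(reachable) != len(injections):
            return False
    return True


H = [(1, 2, 3, 4), (2, 1, 4, 3), (3, 4, 1, 2), (4, 3, 2, 1)]
S4 = list(permutations(range(1, 5)))

print(is_r_fold_transitive(H, 4, 1))   # True
print(is_r_fold_transitive(H, 4, 2))   # False
print([is_r_fold_transitive(S4, 4, r) for r in range(1, 5)])
# [True, True, True, True]
```

The comparison of sizes is enough because each pf is itself one-to-one, so `reachable` is always a subset of `injections`.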

Consider another example. Suppose G is a permutation group of degree $m \ge 2$.
Let $j \in V = \{1, 2, \ldots, m\}$ be fixed but arbitrary. Because p(j) = j for all p in
the stabilizer subgroup $G_j$, the set {j} is an orbit of $G_j$. Thus, $G_j$ is not transitive.
Suppose, however, we ignore the orbit {j} and think of $G_j$ as a permutation group
of degree m − 1 acting on

$$V_j = V \setminus \{j\} = \{1, 2, \ldots, j-1, j+1, \ldots, m\}.$$

If G is (r + 1)-fold transitive on V, then $G_j$ is r-fold transitive on $V_j$. This
observation even has a partial converse.

3.3.16 Lemma. Let G be a permutation group of degree $m \ge 3$. Let
$V = \{1, 2, \ldots, m\}$, and suppose $1 \le r < m$. If the stabilizer subgroup $G_j$ is r-fold
transitive on $V_j = V \setminus \{j\}$, $1 \le j \le m$, then G is (r + 1)-fold transitive on V.

Proof. Let $(x_1, x_2, \ldots, x_{r+1})$ and $(y_1, y_2, \ldots, y_{r+1})$ be two one-to-one functions in
$F_{r+1,m}$. Because $m \ge 3$, there is some $t \in V$ such that $x_1 \ne t \ne y_1$. By hypothesis,
there is a permutation $f \in G_t$ such that $f(x_1) = y_1$. Suppose $f(x_k) = z_k$,
$2 \le k \le r + 1$. Since f is one-to-one, and the y's are all different, $z_k \ne y_1 \ne y_k$,
$2 \le k \le r + 1$. So, another application of the hypothesis yields a permutation
$g \in G_{y_1}$ such that $g(z_k) = y_k$, $2 \le k \le r + 1$. If p = gf, then
$p(x_1) = g(f(x_1)) = g(y_1) = y_1$, and $p(x_k) = g(f(x_k)) = g(z_k) = y_k$,
$2 \le k \le r + 1$, i.e., $p \in G$ and

$$p(x_k) = y_k, \quad 1 \le k \le r + 1. \qquad ■$$

Let's return to our conjecture based on Inequalities (3.22) and (3.23). Because $S_3$
is 3-fold transitive, the only way to salvage the conjecture is by replacing the lower
bound with 5. All right. Suppose we could prove the modified conjecture. Then,
what comes after 5?

3.3.17 Example. Let's see what we get when we average the fourth powers of
the numbers of fixed points of the permutations in a 4-fold transitive group, e.g.,
$S_4$. The cycle types of the permutations in $S_4$ are [4], [3, 1], $[2^2]$, $[2, 1^2]$, and $[1^4]$.
Permutations with cycle types [4] and $[2^2]$ don't have fixed points. There are
P(4, 3)/3 = [4 × 3 × 2]/3 = 8 permutations of cycle type [3, 1], each of which
has one fixed point. Permutations of type $[2, 1^2]$ have two fixed points, and there
are C(4, 2) = 6 of these. Finally, $e_4$ has four fixed points. So,

$$\frac{1}{24} \sum_{g \in S_4} F(g)^4 = \frac{1}{24}\left[8 \times 1^4 + 6 \times 2^4 + 4^4\right]
= \frac{1}{24}[8 + 96 + 256] = 15.$$

If there is a theorem here, it involves the sequence 

1,2,5,15,... 

Amazingly enough, that sequence is familiar. The first four terms, at least, are Bell 
numbers, sums of Stirling numbers of the second kind. 
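The pattern 1, 2, 5, 15 can be confirmed by direct enumeration. The following sketch is an illustration, not the book's method: it averages $F(g)^r$ over all 24 permutations of $S_4$ (tuples in sequence notation, generated with `itertools.permutations`) and compares the results with Bell numbers computed from the recurrence $B_r = \sum_{k=0}^{r-1} C(r-1, k) B_k$ with $B_0 = 1$. The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from math import comb


def fixed_points(p):
    """F(p) for a permutation p in sequence notation."""
    return sum(1 for x, image in enumerate(p, start=1) if x == image)


def average_power(group, r):
    """Average of F(g)^r as g runs over the group (exact)."""
    return Fraction(sum(fixed_points(g) ** r for g in group), len(group))


def bell_numbers(n):
    """Return [B_0, B_1, ..., B_n] using the Bell recurrence."""
    B = [1]
    for r in range(1, n + 1):
        B.append(sum(comb(r - 1, k) * B[k] for k in range(r)))
    return B


S4 = list(permutations(range(1, 5)))
averages = [average_power(S4, r) for r in range(1, 5)]

print(*averages)             # 1 2 5 15
print(bell_numbers(4)[1:])   # [1, 2, 5, 15]
```

Only exponents r = 1, ..., 4 are used here, matching the degree of $S_4$.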

3.3.18 Theorem. Let G be a permutation group of degree m. If $1 \le r \le m$, then

$$\frac{1}{o(G)} \sum_{g \in G} F(g)^r \ge B_r,$$

the rth Bell number, with equality if and only if G is r-fold transitive.

Proof. The proof is by induction on r. The r = 1 case having already been established
in Corollary 3.3.12, we may assume $r \ge 2$. If m = 2, then $G = S_2$ or
$G = \{e_2\}$. As the result is easily seen to be valid in both of these cases, we may
assume $m \ge 3$.

As in the proof of Burnside's lemma, a certain set is counted in two different
ways. Let

$$T = \{(g, i_1, i_2, \ldots, i_r) : g \in G \text{ and } g(i_k) = i_k,\ 1 \le k \le r\}.$$

By the fundamental counting principle, $F(g)^r$ of the elements of T begin with g.
Thus,

$$o(T) = \sum_{g \in G} F(g)^r.$$

Any element of T that ends with $j = i_r$ must begin with a permutation $g \in G_j$. By
the fundamental counting principle, there are $F(g)^{r-1}$ ways to choose the intermediate
r − 1 entries. Therefore,

$$o(T) = \sum_{j=1}^{m} \sum_{g \in G_j} F(g)^{r-1}.$$

Setting these two different-looking values of o(T) equal to each other produces

$$\sum_{g \in G} F(g)^r = \sum_{j=1}^{m} \sum_{g \in G_j} F(g)^{r-1}. \tag{3.24}$$

Of course, every $g \in G_j$ has at least one fixed point, namely j. Let
$F_1(g) = F(g) - 1$. Then, for $g \in G_j$, $F_1(g)$ is the number of fixed points of the
restriction of g to

$$V_j = \{1, 2, \ldots, j-1, j+1, \ldots, m\}.$$

Substituting $F(g) = F_1(g) + 1$ in Equation (3.24) produces

$$\begin{aligned}
\sum_{g \in G} F(g)^r
&= \sum_{j=1}^{m} \sum_{g \in G_j} [F_1(g) + 1]^{r-1} \\
&= \sum_{j=1}^{m} \sum_{g \in G_j} \sum_{k=0}^{r-1} C(r-1, k)\, F_1(g)^k \\
&= \sum_{j=1}^{m} \sum_{k=0}^{r-1} C(r-1, k) \sum_{g \in G_j} F_1(g)^k \\
&\ge \sum_{j=1}^{m} o(G_j) \sum_{k=0}^{r-1} C(r-1, k)\, B_k \\
&= B_r \sum_{j=1}^{m} o(G_j)
\end{aligned} \tag{3.25}$$

by the binomial theorem, induction, and the Bell recurrence relation (Theorem
2.2.7). Moreover, by the induction hypothesis, equality holds in Equation (3.25)
if and only if $G_j$ is (r − 1)-fold transitive for all j, if and only if (Lemma 3.3.16)
G is r-fold transitive. Finally, by Equations (3.20) and (3.21),
$\sum_{j=1}^{m} o(G_j) = t\,o(G) \ge o(G)$, with equality if and only if t = 1, if and only if
G is transitive. Because an r-fold transitive group is transitive, the proof is complete. ■

The ability of mathematicians to list all finite doubly transitive permutation
groups* has robbed Theorem 3.3.18 of one application, but there are others. Recall
the enumeration in Section 2.3 of the permutations in $S_m$ that have exactly k fixed
points: There are C(m, k) ways to choose the numbers to be fixed and D(m − k)
ways to derange the remaining m − k numbers. Therefore,

$$\sum_{g \in S_m} F(g)^r = \sum_{k=1}^{m} k^r\, C(m, k)\, D(m - k),$$

where D(0) = 1. If $r \le m$ then, from Theorem 3.3.18,

$$B_r = \frac{1}{m!} \sum_{k=1}^{m} k^r\, C(m, k)\, D(m - k), \tag{3.26}$$

a formula for the Bell numbers in terms of the derangement numbers.

* See, e.g., P. J. Cameron, Permutation groups, Chapter 12 in Handbook of Combinatorics (R. L. Graham,
M. Grötschel, and L. Lovász, Eds.), MIT Press, Cambridge, MA, 1995. From the list of doubly transitive
groups, the r-fold transitive groups (r > 2) can be determined by inspection. In particular, the only 6-fold
transitive groups are $S_m$ ($m \ge 6$) and $A_m$ ($m \ge 8$), where $A_m$ is the alternating group of even
permutations found in Exercise 23, Section 3.2.

There is more. From Equation (2.18) in Section 2.3,
$D(m - k)/[(m - k)!] = \sum_{t=0}^{m-k} (-1)^t / t!$. Therefore,

$$B_r = \sum_{k=1}^{m} \frac{k^r}{k!} \sum_{t=0}^{m-k} \frac{(-1)^t}{t!}.$$

Since this identity is valid for all $m \ge r$, we may as well let m go to infinity.*
Because

$$\lim_{n \to \infty} \sum_{t=0}^{n} \frac{(-1)^t}{t!} = e^{-1},$$

it follows that

$$B_r = \frac{1}{e} \sum_{k=1}^{\infty} \frac{k^r}{k!},$$

a formula due to G. Dobinski.†
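Both formulas are easy to test numerically. The sketch below is an illustrative check, not part of the text. It assumes the usual derangement recurrence D(n) = (n − 1)[D(n − 1) + D(n − 2)] with D(0) = 1 and D(1) = 0, evaluates the right-hand side of Equation (3.26) exactly, and approximates Dobinski's series by a partial sum in floating point. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb, e, factorial


def bell(r):
    """B_r from the recurrence B_r = sum_k C(r-1, k) B_k, with B_0 = 1."""
    B = [1]
    for n in range(1, r + 1):
        B.append(sum(comb(n - 1, k) * B[k] for k in range(n)))
    return B[r]


def derangements(n):
    """D(n), with D(0) = 1 and D(1) = 0 (assumed standard recurrence)."""
    d = [1, 0]
    for i in range(2, n + 1):
        d.append((i - 1) * (d[i - 1] + d[i - 2]))
    return d[n]


def bell_from_derangements(r, m):
    """Right-hand side of Equation (3.26); intended for r <= m."""
    total = sum(k ** r * comb(m, k) * derangements(m - k)
                for k in range(1, m + 1))
    return Fraction(total, factorial(m))


def dobinski_partial(r, terms):
    """Partial sum (1/e) * sum_{k=1}^{terms} k^r / k! (floating point)."""
    return sum(k ** r / factorial(k) for k in range(1, terms + 1)) / e


for r in range(1, 6):
    print(r, bell(r), bell_from_derangements(r, 6), dobinski_partial(r, 30))
```

The first two columns after r agree exactly, while the last column only approaches the Bell number as more terms are included, which is where the questions of convergence mentioned in the text come in.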

Further applications of Burnside's Lemma depend upon the notion of a symme- 
try group, the topic of the next section. 



3.3. EXERCISES 

1 Let $G = \{e_4, (23), (14), (14)(23)\}$. Exhibit $O_x$ when
(a) x = 1 (b) x = 2 (c) x = 3.

2 Let $G = \langle (123)(45) \rangle \subseteq S_5$.

(a) Exhibit $G_x$ when x = 3.

(b) Exhibit $O_x$ when x = 3.

(c) Confirm that $o(O_3) = o(G)/o(G_3)$.

* This involves questions of convergence, bringing us to the boundary between combinatorics and analysis.
† G. Dobinski, Grunert's Arch. 61 (1877), 333-336.

(d) Exhibit $G_5$.

(e) Exhibit $O_5$.

(f) Confirm that $o(O_5) = o(G)/o(G_5)$.

(g) Compute the average of the numbers of fixed points of the permutations 
in G. 

3 Let $G = \{e_4, (1234), (1432), (13), (24), (12)(34), (13)(24), (14)(23)\} \subseteq S_4$.

(a) Prove that the group G is transitive, directly from the definition.

(b) Prove that G is transitive by averaging numbers of fixed points.

(c) Is G doubly transitive?

4 What are the orbits of the cyclic group

(a) $\langle (123)(45) \rangle \subseteq S_5$? (b) $\langle (123)(45) \rangle \subseteq S_6$?

(c) $\langle (1234) \rangle \subseteq S_4$? (d) $\langle (1234) \rangle \subseteq S_8$?

5 Average the numbers of fixed points in the cyclic group

(a) $\langle (123)(45) \rangle \subseteq S_5$. (b) $\langle (123)(45) \rangle \subseteq S_6$.

(c) $\langle (1234) \rangle \subseteq S_4$. (d) $\langle (1234) \rangle \subseteq S_8$.

6 Average $F(g)^2$ as g runs over the cyclic group

(a) $\langle (123)(45) \rangle \subseteq S_5$. (b) $\langle (123)(45) \rangle \subseteq S_6$.

(c) $\langle (1234) \rangle \subseteq S_4$. (d) $\langle (1234) \rangle \subseteq S_8$.

7 Let $A_4 = \{e_4, (123), (124), (132), (134), (142), (143), (234), (243), (12)(34), (13)(24), (14)(23)\} \subseteq S_4$.

(a) Find the number of orbits of $A_4$ using Burnside's lemma.

(b) Use Inequality (3.23) to show that $A_4$ is doubly transitive.

(c) Use Theorem 3.3.18 to decide whether $A_4$ is 3-fold transitive.

8 Confirm the validity of Theorem 3.3.18 when r = m = 2.

9 Let G be a permutation group of degree m.

(a) If G is transitive, prove that $o(G_x) = o(G_y)$ for all $x, y \in \{1, 2, \ldots, m\}$.

(b) Prove that G is transitive if and only if the following condition is satisfied:
For every $x \in \{1, 2, \ldots, m\}$, there exists a permutation $p \in G$ such that
p(1) = x.

10 Let $G = \{e_2\}$. Show that $G_j$ is transitive on $\{1, 2\} \setminus \{j\}$ for each $j \in \{1, 2\}$ and
yet G is not doubly transitive.

11 By a direct computation along the lines of Example 3.3.17, confirm that

(a) $\frac{1}{6} \sum_{g \in S_3} F(g)^r = B_r$, $1 \le r \le 3$.
(b) $\frac{1}{120} \sum_{g \in S_5} F(g)^r = B_r$, $1 \le r \le 5$.

12 By evaluating the right-hand side, confirm that



(»> B 3 = Ei,EV- (b) B3 = EfEV- 



13 How close is (1/e) ELiO^AO t0 #3 = 5? 

14 Hugh Edgar pointed out that the conclusion of Theorem 3.3.18 does not follow
without the hypothesis $r \le m$. Show that

(b) $\frac{1}{o(S_5)} \sum_{g \in S_5} F(g)^6 = B_6 - 1$.

15 Let G be a subgroup of $S_m$. Prove that

with equality if and only if $G = S_m$.

16 Denote by $D_{r,m}$ the subset of $F_{r,m}$ consisting of all P(m, r) one-to-one
functions. For each $p \in S_m$, denote by $\hat{p} : D_{r,m} \to D_{r,m}$ the induced action of
p on $D_{r,m}$ defined by $\hat{p}(f) = p \circ f$, $f \in D_{r,m}$.

(a) Show that $\widehat{pq} = \hat{p}\hat{q}$ for all $p, q \in S_m$.

(b) Suppose G is a subgroup of $S_m$. Explain why $\hat{G} = \{\hat{p} : p \in G\}$ can be
viewed as a subgroup of $S_{P(m,r)}$.

(c) Prove that G is an r-fold transitive subgroup of $S_m$ if and only if $\hat{G}$ acts
transitively on $D_{r,m}$.

17 Let G be a transitive permutation group of degree m > 1. Prove that G contains
a derangement.

18 A permutation group G of degree m is semiregular if $G_x = \{e_m\}$ for all
$x \in V = \{1, 2, \ldots, m\}$.

(a) If G is semiregular, prove that $o(O_x) = o(G)$ for every $x \in V$.

(b) If G is a semiregular permutation group of degree m, prove that $o(G) \mid m$,
i.e., that the cardinality of G exactly divides its degree.

(c) Suppose G is a transitive permutation group of degree m. Prove that G is
semiregular if and only if o(G) = m. (A transitive semiregular permutation
group is said to be regular.)

19 Let G be an r-fold transitive permutation group of degree m. Prove or disprove
that $P(m, r) \mid o(G)$, i.e., that o(G) is some integer multiple of the product
$m(m - 1) \cdots (m - r + 1)$.

20 Let G be a permutation group of degree m. Suppose $x, y \in V = \{1, 2, \ldots, m\}$.
Define $H = G_x$, the stabilizer subgroup of x. Then, both G and H partition V
into a disjoint union of orbits. For the purposes of this exercise only, denote by
Gx (not to be confused with $G_x$) the orbit of G to which x belongs and by Hy
the orbit of $H = G_x$ to which y belongs. Prove that

$$o(G) = o(Gx)\,o(Hy)\,o(H_y),$$

where $H_y = \{p \in G_x : p(y) = y\} = \{p \in G : p(x) = x \text{ and } p(y) = y\}$.



3.4. SYMMETRY GROUPS 

Permutation groups arise naturally in discussions of symmetry. Imagine the square 
in Fig. 3.4.1a drawn on a sheet of plain paper which is then passed through a copy 
machine to produce an overhead projection transparency. If the transparency were 
aligned on top of the paper, so that the two squares were superimposed, you would 
see what appeared to be a single square. However, if the point of a compass were 
placed at the intersection of the diagonals of that square and (just) the transparency 
rotated 36 degrees in the clockwise direction, you would see two overlapping 
squares. Therefore, a 36° clockwise rotation is not a symmetry of the square. 
Had the transparency been rotated exactly 90°, the squares again would be super- 
imposed, and again you would see what appeared to be just one square. Thus, a 90° 
clockwise rotation is a symmetry of the square. 

It would be useful to have a list of the different symmetries of a square. This 
requires us to be a little more precise about what we mean by a symmetry and a 
lot more precise about what we mean by different. 

Suppose the vertices of the square in Fig. 3.4.1a are numbered, as shown in
Fig. 3.4.1b. Never mind that a 90° rotation is not a symmetry of the labeled figure.
The labels are only there to facilitate our discussion. While they rotate with
the square, they are not part of it. (Since we are imagining things anyway, feel
free to imagine that the numbers are transparent.)

A 90° clockwise rotation acts as a permutation of the vertices. Vertex 1 is sent to 
the position formerly occupied by vertex 2, vertex 2 goes to the place previously 
held by vertex 4, and so on. It seems natural to associate the permutation p = 
(1243) with a 90° clockwise rotation. 



Figure 3.4.1. (a) The square. (b) The square with vertices numbered 1, 2 (top) and 3, 4 (bottom).

Figure 3.4.2. Rotations of the numbered square about axes in its plane: (12)(34), (13)(24), (14), (23).



What about a 90° counterclockwise rotation? That corresponds to q = (1342),
the same permutation associated with a clockwise rotation of 270°! To be a symmetry,
what matters is where the figure ends up, not the route it took getting there. Two
symmetries are the same if and only if they afford the same permutation. A 90°
counterclockwise rotation and a 270° clockwise rotation are different geometric
routes to the same symmetry.

Because each symmetry of the square corresponds to a unique permutation of its 
vertices, we may as well use permutations as convenient descriptive names for sym- 
metries. (Be careful. This discussion is taking place in the context of some fixed but 
arbitrary numbering of the vertices. While the symmetries may not depend on these 
numbers, their permutation names will.) 

Just four symmetries come from rotating the square around the compass point,
i.e., about an axis through its center, perpendicular to the square. They are (1243),
(1342), (14)(23), and $e_4$. (The 360° rotation and the 0° rotation are two routes to
the symmetry whose permutation name is $e_4$.) Four more symmetries arise from
rotations about axes that lie in the plane of the square. (See Fig. 3.4.2.)

With respect to the vertex numbering of Fig. 3.4.1b, the set of all symmetries of
the square is

$$D_4 = \{e_4, (1243), (14)(23), (1342), (12)(34), (13)(24), (14), (23)\}. \tag{3.28}$$

Many remarkable things can be said about $D_4$, none of which address the question
that seems to be foremost in people's minds. Let's deal with that issue first. Why is
it called $D_4$? Here are some responses: (1) It had to be called something; (2) "$D_n$"
is the name traditionally given to the symmetries of the regular n-gon; (3) "D"
stands for dihedral, a name that someone once must have thought was descriptive.

Let's talk substance. Perhaps the most obvious substantive thing to be said about
$D_4$ is that it contains 8 permutations. Only one-third of the 24 permutations in $S_4$ are
symmetries of the square. The (painful) effect of applying the permutation (12) to
the hapless square is illustrated in Fig. 3.4.3.

What if two symmetries are performed in succession? From the geometric
perspective, this process is easy to understand. Following a symmetry with a symmetry
produces another symmetry. So far, so good. But which one? How is the
unique permutation that describes a combination of symmetries related to their
individual permutation descriptions?

3.4.1 Example. Suppose we follow a 90° clockwise rotation p = (1243) with q, a
180° rotation about an axis through the lower left and upper right-hand corners of
the square. Which of the eight elements of $D_4$ describes the combined symmetry?






Vertex 1 is sent by p to the position formerly occupied by vertex 2, a location on 
the axis of rotation of q = (14). Since q fixes a vertex in that position, the combi- 
nation of p followed by q sends 1 to 2. Note that it is not the number of vertex 1 that 
determines where it is sent by symmetry q = (14); it is the number of the position 
that vertex 1 occupies when symmetry q is applied. 

Vertex 2 is sent by p to vertex 4's original position, and symmetry q sends a vertex
in that place to vertex 1's initial position. Therefore, p followed by q sends 2 to
1. Evidently, (12) is a cycle in the disjoint cycle factorization of the combined
symmetry.

Vertex 3 is sent by p to the initial position of vertex 1, and q sends a vertex in that
place to the position originally occupied by vertex 4. Finally, the combined symmetry
sends 4 to 3. So, the combination, first p = (1243), then q = (14),
yields (12)(34), a 180° rotation about an axis through the midpoints of sides 1-2
and 3-4. □



The most remarkable thing about the process of describing the combined 
symmetry, first p then q, is that it is identical to the process for computing the 
composition qp, i.e., (14) ∘ (1243) = (12)(34). Let's formalize this discovery. 



3.4.2 Theorem. Let p and q be symmetries of some object F. Then the 
permutation afforded by the combined symmetry first p then q is the composition qp. 
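
The computation in Example 3.4.1 can be checked mechanically. The following is a minimal
sketch, not part of the original text: the helper names perm_from_cycles, compose, and
to_cycles are our own, and a permutation is stored as a dictionary sending i to p(i).

```python
def perm_from_cycles(cycles, n):
    """Return a dict i -> p(i) on {1, ..., n} built from a list of cycles (tuples)."""
    p = {i: i for i in range(1, n + 1)}
    for cyc in cycles:
        # Each entry of a cycle is sent to its successor; the last wraps to the first.
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            p[a] = b
    return p


def compose(q, p):
    """Return the composition q o p: apply p first, then q."""
    return {i: q[p[i]] for i in p}


def to_cycles(p):
    """Disjoint cycle factorization of p, omitting cycles of length 1."""
    seen, cycles = set(), []
    for start in sorted(p):
        if start in seen:
            continue
        cyc, i = [], start
        while i not in seen:
            seen.add(i)
            cyc.append(i)
            i = p[i]
        if len(cyc) > 1:
            cycles.append(tuple(cyc))
    return cycles


p = perm_from_cycles([(1, 2, 4, 3)], 4)   # 90 degree clockwise rotation
q = perm_from_cycles([(1, 4)], 4)         # flip about the diagonal axis

print(to_cycles(compose(q, p)))  # first p then q: [(1, 2), (3, 4)]
print(to_cycles(compose(p, q)))  # first q then p: [(1, 3), (2, 4)]
```

The second line of output is a reminder that the order matters: first q then p is a
different symmetry from first p then q.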

When we elected to use permutations as convenient descriptive names for
symmetries, there was no reason to believe that a combination of symmetries would
have any connection at all to the composition of their corresponding permutation
names. This unexpected relationship has some profound consequences. For one
thing, the set of permutations representing the family of all symmetries of an object
is closed under composition. In particular, D4 is more than a subset of S4, it is a
subgroup.

Figure 3.4.4. The numbered faces of a die.

3.4.3 Definition. Let G be a subgroup of S m . If it is possible to label some object 
F in such a way that every element of G is a symmetry of F, then G is a symmetry 
group. 

Among the symmetries of the square are those that can be achieved under the 
constraint that the superimposed transparency remain flat on top of the original. 
More generally, a plane symmetry is one that can be performed entirely within 
the two-dimensional plane. The plane symmetries of the square comprise a symmetry
group, namely $\langle (1243) \rangle = \{e_4, (1243), (14)(23), (1342)\}$. Ironically, these 
are the symmetries that can be described as rotations about an axis perpendicular 
to the plane, while the remaining, nonplanar symmetries can all be construed as 
rotations around axes in the plane. (Nonplanar symmetries can also be visualized 
as reflections.) 

Let's consider a real-life example, the cube. It is conventional in "the real 
world" to number, not the vertices, but the faces of cubes. The standard way to 
number dice is illustrated in Fig. 3.4.4. 

How many symmetries does a cube have? Let's begin with an analogy. The 
square is a two-dimensional form lying in the plane. It seemed natural to partition 
the symmetries of the square into two types, planar and nonplanar. The cube is a 
three-dimensional figure. Its symmetries naturally split between those that can be 
accomplished entirely within three-dimensional space, and those that cannot. The 
three-dimensional symmetries are all rotations (of the kind taking place 24/7 in 
gambling casinos from Atlantic City to Las Vegas). The remaining symmetries 
are reflections. 



* Just as the nonplanar symmetries of the square can be visualized as rotations through a third dimension, 
reflections of the cube can be construed as rotations through a fourth dimension. But, we will not make use 
of this idea. For us, a rotational symmetry of the cube is a three-dimensional rotation. 
Let's count the rotational symmetries of a cube. Any one of the six numbered
faces of a die can be rotated to the top. Holding the top and bottom faces with
your forefinger and thumb, any one of the four remaining faces can be rotated to
the front. Once the top and front faces are specified, the remaining faces are
completely determined. So, a cube has 6 × 4 = 24 different orientations and, hence, 24
different rotational symmetries. 

3.4.4 Example. Let's say a die (numbered as in Fig. 3.4.4) is in standard 
position if face 1 is on top and face 2 is in front. (Then face 6 is on the bottom, 
and face 5 is in the back.) With the die in standard position, hold the bottom and 
top faces with your thumb and forefinger and rotate face 2 to the left 90° (that is, 
clockwise when viewed from the top, looking down on face 1). As face 2 moves, 
other faces move too. The faces around the "equator" all rotate to new locations. 
While the squares comprising faces 1 and 6 "experience" a symmetry, they wind 
up in their original positions. The permutation name for this symmetry is (2453). 

Here is another example. (Try to get your hands on a die for this one.) Place your 
forefinger on vertex {1,2,3} (at the intersection of faces 1, 2, and 3) and your 
thumb on vertex {4,5,6}. Rotate face 1 into the position formerly occupied by 
face 3. This time, all six faces change position. The resulting symmetry is 
(132)(456). 

The complete rotational symmetry group of the cube is shown in Fig. 3.4.5. 

□ 
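
As a sanity check on the count of 24, one can let a computer close the two rotations
described in this example, (2453) and (132)(456), under composition. The sketch below is
illustrative only and is not the method used to produce Fig. 3.4.5; the helper names are
our own, and a face permutation p is stored as the tuple (p(1), ..., p(6)).

```python
from collections import Counter


def perm_from_cycles(cycles, n=6):
    """Return the tuple (p(1), ..., p(n)) for the permutation given by its cycles."""
    image = list(range(n + 1))          # index 0 is unused
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            image[a] = b
    return tuple(image[1:])


def compose(q, p):
    """Return q o p (apply p first, then q)."""
    return tuple(q[p[i] - 1] for i in range(len(p)))


def generate(gens):
    """Return the set of all permutations obtainable by composing the generators."""
    identity = tuple(range(1, len(gens[0]) + 1))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group


def cycle_type(p):
    """Cycle lengths of p (1-cycles included), largest first."""
    seen, lengths = set(), []
    for start in range(1, len(p) + 1):
        if start in seen:
            continue
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            length += 1
            i = p[i - 1]
        lengths.append(length)
    return tuple(sorted(lengths, reverse=True))


quarter_turn = perm_from_cycles([(2, 4, 5, 3)])          # about the axis through faces 1 and 6
corner_turn = perm_from_cycles([(1, 3, 2), (4, 5, 6)])   # about the axis through vertex {1,2,3}

G = generate([quarter_turn, corner_turn])
print(len(G))                              # 24
print(Counter(cycle_type(g) for g in G))   # census of cycle structures
```

The census groups the 24 rotations by cycle structure: one identity, six quarter turns,
three half turns about face axes, six half turns about edge axes, and eight turns about
vertex axes.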

Perhaps it is inconsistent to describe the symmetries of a square as permutations 
of its vertices and the symmetries of a cube as permutations of its faces. Why not 
view the symmetries of a cube as vertex permutations? What difference would it 
make? The symmetries themselves are independent of whether we describe them 
in terms of faces or vertices, or edges for that matter. One practical sort of
difference is that, as permutations of the faces, the symmetries of the cube comprise a
subgroup of $S_6$. As vertex permutations, they form a subgroup of $S_8$, and as edge
permutations, they constitute a subgroup of $S_{12}$.*

There is some nice geometry associated with expressing the rotational symmetries
of a cube as permutations of the vertices. Imagine two congruent, square-based
pyramids. The object that results from gluing the square bases together, so that they
disappear into the interior, is an octahedron. So, an octahedron is a polyhedron with
8 triangular faces, 6 vertices (each surrounded by four faces), and 12 edges. If each
of the faces is equilateral, the octahedron is said to be regular.

Figure 3.4.5. The rotational symmetry group of the cube.

*As abstract groups, these manifestations of the rotational symmetry group of the cube
are all isomorphic to $S_4$.

Figure 3.4.6. A regular octahedron inside a cube.

From Fig. 3.4.6, one sees that a regular octahedron will fit inside an appropri- 
ately sized cubical box in such a way that each vertex of the octahedron is aligned 
with the center of the corresponding face of the box (and each vertex of the cube is 
centered directly above a face of the octahedron). If one of the 24 rotational sym- 
metries of the cube is applied (gently) to the box, the result is also a symmetry of 
the octahedron inside. In other words, every rotational symmetry of the cube is 
simultaneously a rotational symmetry of the regular octahedron (and vice versa). 
The cube and the regular octahedron share the same rotational symmetry group! 
The manifestation of this group as permutations of the eight vertices of the 
cube is identical to its manifestation as permutations of the eight faces of the octa- 
hedron. Indeed, this group is commonly known to mathematicians as the octahedral 
group. 

While this discussion is pleasant enough, it doesn't seem to be getting us any
closer to a concrete realization of the octahedral group as a subgroup of $S_8$. As a
step in that direction, let's agree to number the vertices of a die as follows:

$\mathbf{1} = \{1,2,3\}$, $\mathbf{2} = \{1,2,4\}$, $\mathbf{3} = \{1,3,5\}$, $\mathbf{4} = \{1,4,5\}$,

$\mathbf{5} = \{2,3,6\}$, $\mathbf{6} = \{2,4,6\}$, $\mathbf{7} = \{3,5,6\}$, $\mathbf{8} = \{4,5,6\}$,    (3.29)

where, e.g., $\mathbf{6} = \{2,4,6\}$ means that (boldface) number 6 is assigned to the vertex
formed by the intersection of the even-numbered faces.
If some symmetry of the cube, manifested as a permutation of its six faces,
corresponds to $p \in S_6$, then, as a permutation of the eight vertices of the cube,
that same symmetry corresponds to the permutation $\tilde{p} \in S_8$ defined by

$\tilde{p}(\{i,j,k\}) = \{p(i), p(j), p(k)\}$.

We will say that $\tilde{p}$ is induced by $p$. If, e.g., $p = (25)(34) \in S_6$, then

$\tilde{p}(\mathbf{1}) = \tilde{p}(\{1,2,3\})$
$= \{p(1), p(2), p(3)\}$
$= \{1,5,4\}$
$= \mathbf{4}$.    (3.30)

Similarly,

$\tilde{p}(\mathbf{4}) = \tilde{p}(\{1,4,5\})$
$= \{p(1), p(4), p(5)\}$
$= \{1,3,2\}$
$= \mathbf{1}$.    (3.31)

Evidently, $(\mathbf{14})$ is a cycle in the disjoint cycle factorization of $\tilde{p}$. In the same way,
$\tilde{p}(\mathbf{2}) = \{p(1), p(2), p(4)\} = \{1,5,3\} = \mathbf{3}$, $\tilde{p}(\mathbf{3}) = \{p(1), p(3), p(5)\} = \{1,4,2\} = \mathbf{2}$,
and so on. Continuing in this way, we find that $\tilde{p} = (\mathbf{14})(\mathbf{23})(\mathbf{58})(\mathbf{67})$. Notice that
$o(\tilde{p}) = 2$, as it should. Because $\tilde{p}$ and $p = (25)(34)$ represent the same symmetry,
they have the same order.
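
The rule for passing from a face permutation to the vertex permutation it induces is easy
to mechanize. The following sketch assumes the vertex numbering introduced above; the
names VERTEX_FACES, VERTEX_NUMBER, perm_from_cycles, induced, and to_cycles are
illustrative choices of ours, not notation from the text.

```python
# Vertex numbering of the die: vertex number -> the three faces that meet there.
VERTEX_FACES = {
    1: {1, 2, 3}, 2: {1, 2, 4}, 3: {1, 3, 5}, 4: {1, 4, 5},
    5: {2, 3, 6}, 6: {2, 4, 6}, 7: {3, 5, 6}, 8: {4, 5, 6},
}
# Reverse lookup: set of three faces -> vertex number.
VERTEX_NUMBER = {frozenset(faces): v for v, faces in VERTEX_FACES.items()}


def perm_from_cycles(cycles, n=6):
    """Return a dict i -> p(i) on {1, ..., n} built from a list of cycles."""
    p = {i: i for i in range(1, n + 1)}
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            p[a] = b
    return p


def induced(p):
    """Vertex permutation induced by the face permutation p.

    Vertex {i, j, k} is sent to vertex {p(i), p(j), p(k)}. A KeyError means the
    image is not a vertex, i.e. p is not a symmetry of the cube.
    """
    return {
        v: VERTEX_NUMBER[frozenset(p[i] for i in faces)]
        for v, faces in VERTEX_FACES.items()
    }


def to_cycles(p):
    """Disjoint cycle factorization of p, omitting cycles of length 1."""
    seen, cycles = set(), []
    for start in sorted(p):
        if start in seen:
            continue
        cyc, i = [], start
        while i not in seen:
            seen.add(i)
            cyc.append(i)
            i = p[i]
        if len(cyc) > 1:
            cycles.append(tuple(cyc))
    return cycles


half_turn = perm_from_cycles([(2, 5), (3, 4)])
print(to_cycles(induced(half_turn)))   # [(1, 4), (2, 3), (5, 8), (6, 7)]
```

Applying induced to perm_from_cycles([(2, 4, 5, 3)]) gives two 4-cycles, (1243) and
(5687), on the vertices.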

3.4.5 Example. As a permutation of die numbered faces, $p = (2453) \in S_6$ is a
rotational symmetry of the cube. Before describing its induced action, observe that
the least common multiple of the lengths of the disjoint cycles of $\tilde{p}$ is
$o(\tilde{p}) = o(p) = 4$. Therefore, every cycle of $\tilde{p}$ has length $2^k$, where $0 \le k \le 2$.
Moreover, at least one cycle of $\tilde{p}$ must have length equal to 4. Let's confirm these
deductions:

$\tilde{p}(\mathbf{1}) = \tilde{p}(\{1,2,3\}) = \{p(1), p(2), p(3)\} = \{1,4,2\} = \mathbf{2}$,
$\tilde{p}(\mathbf{2}) = \tilde{p}(\{1,2,4\}) = \{p(1), p(2), p(4)\} = \{1,4,5\} = \mathbf{4}$,
$\tilde{p}(\mathbf{4}) = \tilde{p}(\{1,4,5\}) = \{p(1), p(4), p(5)\} = \{1,5,3\} = \mathbf{3}$,
$\tilde{p}(\mathbf{3}) = \tilde{p}(\{1,3,5\}) = \{p(1), p(3), p(5)\} = \{1,2,3\} = \mathbf{1}$.

So, $(\mathbf{1243})$ is a cycle of $\tilde{p}$. Beginning a new cycle with $\mathbf{5}$,

$\tilde{p}(\mathbf{5}) = \tilde{p}(\{2,3,6\}) = \{p(2), p(3), p(6)\} = \{4,2,6\} = \mathbf{6}$,
$\tilde{p}(\mathbf{6}) = \tilde{p}(\{2,4,6\}) = \{p(2), p(4), p(6)\} = \{4,5,6\} = \mathbf{8}$,
$\tilde{p}(\mathbf{8}) = \tilde{p}(\{4,5,6\}) = \{p(4), p(5), p(6)\} = \{5,3,6\} = \mathbf{7}$,
$\tilde{p}(\mathbf{7}) = \tilde{p}(\{3,5,6\}) = \{p(3), p(5), p(6)\} = \{2,3,6\} = \mathbf{5}$.

So, $\tilde{p} = (\mathbf{1243})(\mathbf{5687})$.

Figure 3.4.7. Two manifestations of the octahedral group.

For each rotational symmetry $p$, manifested as a permutation of the (die numbered)
faces of a cube, the corresponding induced vertex permutation $\tilde{p}$ can be
found in Fig. 3.4.7. (Note that $\tilde{p}_1$ and $\tilde{p}_2$ can have the same cycle structure even
when $p_1$ and $p_2$ do not.) □



3.4.6 Example. Whatever its manifestation, the octahedral group G contains
only some of the symmetries of the cube, namely, the 24 rotations. What about
reflections? Suppose a die in standard position (with face 1 on top and face 2 in
front) is laid on a mirror. Imagine the image rising straight up out of the mirror until
it is superimposed on the die, with face 6 of the reflection overlapping face 1 of the
die, face 1 of the reflection overlapping face 6 of the die, and the remaining faces of
the image overlapping the correspondingly numbered faces of the cube. As a
permutation of the faces, this reflection is $r = (16) \in S_6$. (Note, e.g., from Fig. 3.4.5,
that $r \notin G$.)

Given one reflection, it is easy to generate more. If $p \in G$ is any rotational
symmetry, then the composition $q = pr$ is a symmetry. Might $q$ be a rotation? If so, then
$r = p^{-1}q \in G$, a contradiction. Since it cannot be a rotation, $pr$ must be another
reflection. Because $p_1 r = p_2 r$ if and only if $p_1 = p_2$, the set $Gr = \{pr : p \in G\}$
contains 24 different reflections. Moreover, since the die and its reflected image
rotate together, $Gr$ contains all possible reflections, i.e., $H = G \cup Gr$ is the (full)
symmetry group of the cube. As permutations of its six faces, all 48 symmetries
of the cube are given in Fig. 3.4.8. □
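
The construction of the full symmetry group from the rotations and the single reflection
(16) can be replayed by machine. This is a hedged sketch rather than a reproduction of
Fig. 3.4.8: it rebuilds the rotation group from two rotations met earlier, (2453) and
(132)(456), and all function names are our own.

```python
def perm_from_cycles(cycles, n=6):
    """Return the tuple (p(1), ..., p(n)) for the permutation given by its cycles."""
    image = list(range(n + 1))          # index 0 is unused
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            image[a] = b
    return tuple(image[1:])


def compose(q, p):
    """Return q o p (apply p first, then q)."""
    return tuple(q[p[i] - 1] for i in range(len(p)))


def generate(gens):
    """Return the set of all permutations obtainable by composing the generators."""
    identity = tuple(range(1, len(gens[0]) + 1))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group


# Rotations of the cube as permutations of the die-numbered faces.
G = generate([perm_from_cycles([(2, 4, 5, 3)]),
              perm_from_cycles([(1, 3, 2), (4, 5, 6)])])

r = perm_from_cycles([(1, 6)])          # the mirror reflection of Example 3.4.6
Gr = {compose(p, r) for p in G}         # the set {pr : p in G}
H = G | Gr

# H should be closed under composition if it really is a group.
closed = all(compose(a, b) in H for a in H for b in H)
print(len(G), len(Gr), len(G & Gr), len(H), closed)   # 24 24 0 48 True
```

The empty intersection confirms that none of the 24 products pr is a rotation.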
Figure 3.4.8. The 48 symmetries of the cube. 



3.4. EXERCISES 

1 Suppose the vertices of the square in Fig. 3.4.1b are permuted according to the
permutation $p = (13) \in S_4$. Draw a picture of the resulting "twisted" polygon. 

2 Let R be a rectangle of length 5 and width 3. Consecutively number its vertices 
1-4 in clockwise order. 

(a) Use this numbering to write down the group of symmetries of R as 
permutations of its vertices. 

(b) How would the group in part (a) differ from the group of symmetries of R 
as permutations of its edges? 

(c) How sensible is it to discuss the symmetries of R as permutations of its 
face? 

3 Denote by D 3 the group of symmetries of an equilateral triangle as permutations 
of its vertices. 

(a) Show that D 3 = S3. 

(b) Which of the six symmetries are plane symmetries? 

4 How many symmetries does an isosceles right triangle have? How many of 
them are plane symmetries? 

5 Suppose the vertices of a regular pentagon are consecutively numbered 1-5, in 
clockwise order. Use this numbering to exhibit 

(a) the group of plane symmetries of the pentagon. 

(b) D 5 , the group of all 10 symmetries of the pentagon. 






6 Suppose the vertices of a regular hexagon are consecutively numbered 1-6, in 
clockwise order. Use this numbering to exhibit 

(a) the group of plane symmetries of the hexagon. 

(b) $D_6$, the group of all symmetries of the hexagon. 

7 Denote by $D_n$ the group of all symmetries of the regular n-gon. Prove that

$o(D_n) = 2n$, $n \ge 3$.

8 Recall (Example 3.4.4) that a die is in standard position if its top face is
numbered 1 and its front face is numbered 2. The symmetry (1265) might be
described, in words, as a 90° rotation around an axis through the centers of
faces 3 and 4. Similarly, (123)(465) is a 120° rotation around an axis running
diagonally through the cube from vertex {1,2,3} to vertex {4,5,6}. Describe,
in words, the symmetry

(a) (16)(25).          (b) (16)(34).
(c) (16)(24)(35).      (d) (16)(23)(45).
(e) (1463).            (f) (154)(236).

9 A regular tetrahedron is a pyramid with a triangular base in which each of the
four triangular faces is equilateral. Assign numbers 1-4 to the faces of a
regular tetrahedron in some fixed but arbitrary way.

(a) Prove that a regular tetrahedron has 12 rotational symmetries.

(b) Exhibit the rotational symmetries of a regular tetrahedron as a permutation
group of degree 4.

10 Prove that the group of all symmetries of a regular tetrahedron is $S_4$. (See
Exercise 9.)

11 The 24 rotational symmetries of a cube expressed as permutations of its
vertices can be found in Fig. 3.4.7 (in the columns labeled $\tilde{p}$). Express the
remaining 24 symmetries (the reflections) as permutations of the vertices.
(Hint: Example 3.4.6.)

12 Express the 12 rotational symmetries of a regular tetrahedron (see Exercise 9)
as permutations of its six edges. (Hint: An edge is formed by the intersection
of two faces. Unlike a cube, every pair of faces of a tetrahedron meet to form
an edge. Number the edges in dictionary order, i.e., $\mathbf{1} = \{1,2\}$, $\mathbf{2} = \{1,3\}$,
$\mathbf{3} = \{1,4\}$, $\mathbf{4} = \{2,3\}$, $\mathbf{5} = \{2,4\}$, and $\mathbf{6} = \{3,4\}$. Let G be the group
of rotational symmetries as permutations of the four faces. For each $p \in G$, let
$\tilde{p}$ be the natural action induced on the edges, i.e., $\tilde{p}(\{i,j\}) = \{p(i), p(j)\}$.
Express $\tilde{G} = \{\tilde{p} : p \in G\}$ as a subgroup of $S_6$.)

13 Suppose the vertices of a square are numbered, not as shown in Fig. 3.4.1b, but
in consecutive clockwise order. With respect to this numbering scheme, the
permutation names of the elements of $D_4$ will not be the same as those given in
Equation (3.28). Exhibit their permutation names with respect to this
consecutive clockwise numbering scheme.

14 Let G be the rotational symmetry group of the cube expressed as permutations
of its (die numbered) faces. For each $p \in G$, let $\tilde{p}$ be the corresponding vertex
permutation. If $\tilde{G} = \{\tilde{p} : p \in G\}$, define $f : G \to \tilde{G}$ by $f(p) = \tilde{p}$. (See
Fig. 3.4.7.)

(a) Prove that $f(pq) = f(p)f(q)$, $p, q \in G$.

(b) Deduce that $f(p^{-1}) = f(p)^{-1}$, $p \in G$.

15 Prove that D 4 (Equation (3.28)) is transitive but not doubly transitive 

(a) from the definitions and geometric considerations. 

(b) using Equations (3.22) and (3.23). 

16 Prove that the octahedral group (Fig. 3.4.5) is transitive but not doubly 
transitive 

(a) from the definitions and geometric considerations. 

(b) using Inequalities (3.22) and (3.23). 

17 In general, a polyhedron is regular if each of its faces is congruent to the same 
regular polygon and each of its vertices is formed by the intersection of the 
same number of faces. The cube, regular tetrahedron, and regular octahedron 
are examples of regular polyhedra. (If two regular tetrahedra are glued 
together so as to make a face of each disappear into the interior of the 
resulting figure, the outcome is not a regular polyhedron because some vertices 
are formed by the intersection of three faces and some by four.) The regular 
dodecahedron, illustrated in Fig. 3.4.9a, is a regular polyhedron each of whose 
12 faces is a regular pentagon. 

(a) Prove that a regular dodecahedron has 20 vertices and 30 edges. 

(b) Prove that a regular dodecahedron has 60 rotational symmetries. 

18 A fullerene* is a pure carbon molecule, $C_n$, in which the n carbon atoms sit at
the vertices of a polyhedral "cage" whose faces consist of 12 pentagons and
$\frac{1}{2}n - 10$ hexagons. The first fullerenes, $C_{60}$ and $C_{70}$, were isolated in 1990. The
smaller version, $C_{60}$, is in the shape of a (traditional) soccer ball. Each vertex of a
soccer ball, also known as a truncated icosahedron, lies at the intersection
of two hexagonal faces and one pentagonal face. (See Fig. 3.4.9b.)

(a) Compute the number of hexagonal faces of $C_{60}$.

(b) Compute the number of edges of a truncated icosahedron. 

(c) Compute the number of rotational symmetries of a truncated icosahedron. 

(d) In what sense does a truncated icosahedron fail to be a regular 
polyhedron? (See Exercise 17.) 

*Named for R. Buckminster Fuller (1895-1983). 
Figure 3.4.9. (a) A regular dodecahedron; (b) a truncated icosahedron. 



19 Prove that no regular polyhedron (see Exercise 17) has hexagonal faces. 

20 Prove that there are exactly five regular polyhedra. (Hint: Exercise 19.) 

21 Leonhard Euler proved that if a convex polyhedron has F faces, E edges, and V 
vertices, then F + V = E + 2. Confirm Euler's formula for a 

(a) cube. (b) tetrahedron. 

(c) octahedron. (d) square-based pyramid. 

(e) truncated icosahedron (see Exercise 18). 

22 Having six square faces and eight equilateral triangular faces, a cuboctahedron 
is carved from a cube by truncating (slicing off) each vertex with a plane that 
passes through the midpoints of the three edges incident with it. Every edge 
of a cuboctahedron has the same length, namely $1/\sqrt{2}$ times the length of an 
edge of the original cube. 

(a) Confirm Euler's formula (Exercise 21) for the cuboctahedron. 

(b) Discuss the symmetries of a cuboctahedron. 



Its counterpart, the truncated octahedron, has 8 regular hexagonal faces and 6 square faces. 
William Thomson, Lord Kelvin (1824-1907), proposed the truncated octahedron as the shape of a space- 
filling cell that minimizes the ratio of surface area to volume. In 1994, D. Weaire and R. Phelan discovered 
another cell with 14 faces that improves on Lord Kelvin's by 0.3%. It is not known whether this new cell is 
optimal. 
Figure 3.5.1 



3.5. COLOR PATTERNS 



A mathematician, like a painter or a poet, is a maker of patterns. 

— G. H. Hardy 

Let's take some of the materials left lying around from our last discussion, e.g., 
squares and cubes, and recycle them into decorations for Independence Day. We 
might, e.g., take a square and color its vertices red, white, or blue. With respect 
to the vertex numbering of Fig. 3.5.1, any such coloring can be identified with a 
unique function $f : \{1,2,3,4\} \to \{r,w,b\}$. Some colorings, along with the matching
functions, are given in Fig. 3.5.2. 

Surely, it would be going too far to claim that there is room for "artistic expres- 
sion" in decorating squares. Is there room even for some individuality? How many 
different colorings are there? Because coloring the vertices of a square involves four 
decisions, each having three choices, there must be $3^4 = 81$ different colorings. The
set C, consisting of all functions $f : \{1,2,3,4\} \to \{r,w,b\}$, contains 81 elements. 

Wait a minute. Look carefully at the four colorings illustrated in Fig. 3.5.2. How 
different will they be after the paint dries and the squares are free to rotate? It 
seems 81 is the right answer to the wrong question. Let's try to formulate the right 
question. 

Say two colorings (elements of C) are equivalent if one can be obtained from the 
other by a plane rotation of the square. This relation partitions C into equivalence 
classes; let's call them color patterns. The four colorings in Fig. 3.5.2, e.g., com- 
prise a single color pattern. The right question is, how many color patterns are 
there? 
$f_1 = (r,w,r,b)$    $f_2 = (r,r,b,w)$    $f_3 = (b,r,w,r)$    $f_4 = (w,b,r,r)$

Figure 3.5.2
Before we can count color patterns, we need to understand the relation a little
better. Consider, e.g., $f_1$ and $f_2$ in Fig. 3.5.2. Geometrically, the coloring $f_2$ can be
obtained from $f_1$ by a 90° rotation. With respect to the vertex numbering of Fig.
3.5.1, this is the symmetry whose permutation name is $p = (1243)$. However,
function $f_2 \ne p \circ f_1$. In fact, $p f_1$ is meaningless. The image of $f_1$ is a set of colors. It is no
subset of domain($p$) = {1,2,3,4}. The composition of $p$ and $f_1$ makes sense, but
only in the order $f_1 p$. Well, maybe $f_1 p = f_2$. Let's see:

$f_1 p(1) = f_1(p(1)) = f_1(2) = w$,
$f_1 p(2) = f_1(p(2)) = f_1(4) = b$,
$f_1 p(3) = f_1(p(3)) = f_1(1) = r$,
$f_1 p(4) = f_1(p(4)) = f_1(3) = r$.

So, $f_1 p = (w,b,r,r) = f_4$, not $f_2$. The correct combination of $f_1$, $f_2$, and $p$ is (confirm
it!)

$f_2 = f_1 \circ p^{-1}$.    (3.32)
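
Readers who prefer to let a machine "confirm it" can do so with a few lines of code.
The sketch below is our own illustration: a coloring is stored as a tuple of colors
indexed by vertex, a permutation as a dictionary, and the helper names
(perm_from_cycles, inverse, compose_coloring) are hypothetical.

```python
def perm_from_cycles(cycles, n=4):
    """Return a dict i -> p(i) on {1, ..., n} built from a list of cycles."""
    p = {i: i for i in range(1, n + 1)}
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            p[a] = b
    return p


def inverse(p):
    """Return the inverse permutation of p."""
    return {image: i for i, image in p.items()}


def compose_coloring(f, p):
    """Return the coloring f o p, i.e. the tuple (f(p(1)), ..., f(p(n)))."""
    return tuple(f[p[i] - 1] for i in range(1, len(f) + 1))


# The four colorings of Fig. 3.5.2, listed as (f(1), f(2), f(3), f(4)).
f1 = ("r", "w", "r", "b")
f2 = ("r", "r", "b", "w")
f3 = ("b", "r", "w", "r")
f4 = ("w", "b", "r", "r")

p = perm_from_cycles([(1, 2, 4, 3)])      # 90 degree clockwise rotation

print(compose_coloring(f1, p))            # ('w', 'b', 'r', 'r'), which is f4
print(compose_coloring(f1, inverse(p)))   # ('r', 'r', 'b', 'w'), which is f2
```

Composing $f_1$ with the inverses of all four plane rotations produces exactly the four
colorings of Fig. 3.5.2.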

When a fixed but arbitrary symmetry $q \in D_4$ is applied to an $f$-colored square,
another coloring is produced, namely, the one corresponding to $f q^{-1}$. This is
interesting. Associated with each symmetry of the square is a permutation of colored
squares, i.e., permutation $q \in S_4$ acts on the 81-element set C. It's almost as if $q$
were a permutation in $S_{81}$. Let's explore this idea more generally.

3.5.1 Definition. Denote by $C_{m,n}$ (not to be confused with $C(m,n)$) the set of all
functions

$f : \{1, 2, \ldots, m\} \to \{x_1, x_2, \ldots, x_n\}$.

The action of $p \in S_m$ induced on $C_{m,n}$ is defined by

$\hat{p}(f) = f \circ p^{-1}$, $f \in C_{m,n}$.    (3.33)

A couple of comments may be in order: (1) There is no mathematical reason to
introduce $C_{m,n}$. It is a clone of $F_{m,n}$, the set of all functions from $\{1, 2, \ldots, m\}$ into
$\{1, 2, \ldots, n\}$. However, thinking of the elements of $C_{m,n}$ as colorings may make the
mathematics easier to understand. (2) While $\hat{p}$ (in Equation (3.33)) is similar to $\tilde{p}$
(from Section 3.4), $\hat{p}$ and $\tilde{p}$ are not clones. They are two different induced actions of
$p \in S_m$.

3.5.2 Lemma. For any permutation $p \in S_m$, the function $\hat{p} : C_{m,n} \to C_{m,n}$ is
one-to-one and onto. Moreover, if $p, q \in S_m$, then

$\hat{p}\hat{q} = \widehat{pq}$.    (3.34)

Proof. Suppose $f, g \in C_{m,n}$. Then $\hat{p}(f) = \hat{p}(g)$, if and only if $f p^{-1} = g p^{-1}$, if and
only if $f = g$, proving that $\hat{p}$ is one-to-one. Because $\hat{p} : C_{m,n} \to C_{m,n}$, and $C_{m,n}$ is
finite, $\hat{p}$ is onto. Finally, for a fixed but arbitrary $f \in C_{m,n}$,

$\hat{p}\hat{q}(f) = \hat{p}(\hat{q}(f))$
$= \hat{p}(f q^{-1})$
$= (f q^{-1}) p^{-1}$
$= f (q^{-1} p^{-1})$
$= f (pq)^{-1}$
$= \widehat{pq}(f)$. ■

This brings us to a matter of "national security." For the rest of this section,
information will be restricted on a need-to-know basis. From Lemma 3.5.2, $\hat{p}$ is a
permutation acting on $C_{m,n}$. If the functions in $C_{m,n}$ are numbered, from 1 to $n^m$,
then all $\hat{p}$ needs to know are the numbers of the agents being permuted; $\hat{p}$ does
not need to know their true identities. This little metaphor is leading to another
abuse of language, namely, that $\hat{p}$ may as well be viewed as an element of $S_{n^m}$.

Setting $x_1 = r$, $x_2 = w$, and $x_3 = b$ allows $C_{4,3}$ to be identified with C, the set of
red-white-blue vertex colorings of the square. Let $R = \{e_4, (1243), (14)(23),
(1342)\}$ be the symmetry group of plane rotations of the square (with respect
to the vertex numbering of Fig. 3.5.1), and define $\hat{R} = \{\hat{p} : p \in R\}$. Then, by
Lemma 3.5.2 and our national security metaphor, $\hat{R}$ may be regarded as a subset
of $S_{81}$. In fact, $\hat{R}$ is a subgroup. If $p, q \in R$, then, by Equation (3.34),
$\hat{p}\hat{q} = \widehat{pq} \in \hat{R}$, proving that $\hat{R}$ is closed.

Suppose $f, g \in C = C_{4,3}$. As colorings, $f$ and $g$ are equivalent if and only if $g$ can
be obtained from $f$ by a rotation of the square. Translating this statement into
function language, $f$ and $g$ are equivalent if and only if there is a symmetry $p \in R$ such
that $g = f p^{-1}$, if and only if there is a $\hat{p} \in \hat{R}$ such that $\hat{p}(f) = g$, if and only if
$f$ and $g$ are equivalent modulo $\hat{R}$ (viewed as a subgroup of $S_{81}$).

Evidently, this artificially contrived $\hat{R}$ affords another way to state the problem.
How does it bring us any closer to a solution? In fact, $\hat{R}$ is not so much artificially
contrived as artfully crafted. Having identified color patterns with orbits of $\hat{R}$, we
can use Burnside's lemma to count them! The number of color patterns is the
average of the numbers of fixed points of the permutations in $\hat{R}$. Because $o(\hat{R}) = o(R)$
and it doesn't matter whether we sum over $p \in R$ or $\hat{p} \in \hat{R}$, the number of color
patterns is

$\frac{1}{o(\hat{R})} \sum_{\hat{p} \in \hat{R}} F(\hat{p}) = \frac{1}{4} \sum_{p \in R} F(\hat{p})$.    (3.35)

It remains to evaluate $F(\hat{p})$.

If $p = (1243) \in R$, then $p$ is the permutation name for a 90° clockwise rotation
of the square in Fig. 3.5.1 and $f \in C_{4,3}$ is a fixed point of $\hat{p}$ if and only if $\hat{p}(f) = f$, if
and only if the function $f = f p^{-1}$, if and only if the coloring $f$ is unchanged when the
square is turned 90°, if and only if $p$ is a symmetry of the colored square. But, the
only colored squares left unchanged by a 90° rotation are those in which all four
vertices are colored the same. Because there are three colors, there are three such
colorings. In other words, if $p = (1243)$, then $F(\hat{p}) = 3$. The same analysis applies
to $p = (1342)$, the permutation name for a 90° counterclockwise rotation.

What about $p = (14)(23)$? A rotation of 180° switches vertex 1 with vertex 4 and
vertex 2 with vertex 3. Thus, $\hat{p}(f) = f$ if and only if $f(1) = f(4)$ and $f(2) = f(3)$. In
this case, counting the fixed points of $\hat{p}$ involves two decisions, one for each cycle
of $p$. (That's right, $p$.) Because there are 3 choices for each decision, $\hat{p}(f) = f$ for $3^2$
colorings $f \in C$, i.e., $F(\hat{p}) = 9$.

An algorithm is emerging. If $p$ is the permutation name for a generic symmetry,
then the vertices whose numbers belong to a cycle of $p$ are cycled among
themselves. A necessary and sufficient condition for $f$ to be a fixed point of $\hat{p}$ is that $f$
be constant on the vertices within each cycle of $p$. If the disjoint cycle factorization
of $p$ contains a total of $c(p)$ cycles (including cycles of length 1), then $3^{c(p)}$
colorings meet this criterion, i.e., $F(\hat{p}) = 3^{c(p)}$. (If there were 4 colors, $F(\hat{p})$ would be
$4^{c(p)}$.)

Let's try this new algorithm on the remaining element of R, namely $e_4$. Because
$c(e_4) = 4$, we are predicting that $F(\hat{e}_4) = 3^4 = 81$, and that's right. After all, $\hat{e}_4$ is
being identified with $e_{81}$, a permutation with 81 fixed points.

Substituting these values for $F(\hat{p})$, $p \in R$, into Equation (3.35) yields that the
number of inequivalent red-white-blue vertex colorings of a square is
$\frac{1}{4}(81 + 3 + 9 + 3) = 24$. In other words, the 81 colorings of $C = C_{4,3}$ are partitioned by the
plane symmetries of the square into 24 patterns. Symbolically,

$C_{4,3} = P_1 \cup P_2 \cup \cdots \cup P_{24}$,

where

$P_i = \{\hat{p}(g) : p \in R\}$
$= \{g p^{-1} : p \in R\}$
$= \{g p : p \in R\}$

for any coloring $g \in P_i$. Moreover, because $P_i = O_g$, the orbit of $\hat{R}$ to which $g$
belongs, $o(P_i) = o(O_g)$ is the quotient of $o(\hat{R})$ and the cardinality of the stabilizer
subgroup $\hat{R}_g$. (See Lemma 3.3.7.)

Amazing! But, is 24 really correct? It has the virtue, at least, of being an integer. 
But would you stake your life on its being the right integer? What about confirming 
it with a brute-force list? 
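
For a problem this small, the brute-force confirmation can be delegated to a computer.
The following sketch lists all 81 colorings and sorts them into orbits under the four
plane rotations. It is an illustration under our own conventions, not the book's
procedure: a rotation p is stored as the tuple (p(1), p(2), p(3), p(4)), and the names
ROTATIONS, act, and color_patterns are hypothetical.

```python
from collections import Counter
from itertools import product

# Plane rotations of the square: e, (1243), (14)(23), (1342), each as (p(1), ..., p(4)).
ROTATIONS = [(1, 2, 3, 4), (2, 4, 1, 3), (4, 3, 2, 1), (3, 1, 4, 2)]


def act(p, f):
    """Return the coloring f o p^{-1}: the color at vertex i moves to vertex p(i)."""
    g = [None] * len(f)
    for i, color in enumerate(f):
        g[p[i] - 1] = color
    return tuple(g)


def color_patterns(group, colors, m=4):
    """Partition all colorings of m vertices into orbits (color patterns)."""
    remaining = set(product(colors, repeat=m))
    patterns = []
    while remaining:
        f = remaining.pop()
        orbit = {act(p, f) for p in group}   # contains f because the identity is in group
        remaining -= orbit
        patterns.append(orbit)
    return patterns


patterns = color_patterns(ROTATIONS, "rwb")
print(len(patterns))                            # 24
print(Counter(len(orbit) for orbit in patterns))  # how many patterns of each size
```

Picking one coloring from each orbit gives a complete list of inequivalent colorings.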

A system of distinct representatives (SDR) for the color patterns consists of one
coloring from each pattern. Imagine searching for an SDR for the color patterns of
red-white-blue vertex colored squares and arriving at the list displayed in Fig. 3.5.3.
(Convince yourself that no two listed colorings are equivalent, modulo a plane
rotation.) Assuming ignorance or doubt about the total number of patterns, the only
way to be sure the list is complete is to confirm that each of the remaining
81 − 24 = 57 colorings is equivalent to one of those listed. On the other hand,
given that the total number of color patterns is 24, once 24 inequivalent colorings
are found, the list must be complete.

Figure 3.5.3

We've been treating these Independence Day decorations as if they were colored 
squares on plain paper. What about coloring squares on transparencies so that, in 
addition to plane rotations, the colored squares can be flipped over? Because this 
changes the symmetry group, it probably changes the number of color patterns, but 
by how much? What would you guess is the number of color patterns modulo
$D_4 = \{e_4, (1243), (14)(23), (1342), (12)(34), (13)(24), (14), (23)\}$? Does doubling
the symmetry group halve the number of color patterns? Let's see. By Burnside's
lemma, the number of color patterns modulo $D_4$ is

$\frac{1}{8} \sum_{p \in D_4} F(\hat{p}) = \frac{1}{8}(81 + 3 + 9 + 3 + 9 + 9 + 27 + 27) = 21$.

This explains the braces in Fig. 3.5.3. They indicate which patterns, inequivalent
modulo R, coalesce to form single patterns modulo $D_4$.
Let's extend these notions to a more general setting. 

3.5.3 Lemma. Let $c(p)$ be the total number of cycles, including cycles of length
1, in the disjoint cycle factorization of $p \in S_m$. Denote the induced action of $p$ on
$C_{m,n}$ by $\hat{p}$. Then the number of fixed points of $\hat{p}$ is $F(\hat{p}) = n^{c(p)}$.

Proof. If $f \in C_{m,n}$, then $\hat{p}(f) = f p^{-1} = f$, if and only if $f p^{-1}(i) = f(i)$, $1 \le i \le m$,
if and only if $f(i) = f p(i)$, $1 \le i \le m$, if and only if $f(i) = f(j)$ whenever $i$ and $j$ are
in the same cycle of $p$. So, the number of $f$'s fixed by $\hat{p}$ is equal to the number of
ways to make a sequence of $c(p)$ decisions each having $n$ choices. ■

Figure 3.5.4

3.5.4 Theorem. Suppose G is a permutation group of degree m. Let
$\hat{G} = \{\hat{p} : p \in G\}$ be the induced action of G on $C_{m,n}$. Then equivalence modulo
$\hat{G}$ partitions $C_{m,n}$ into a disjoint union of color patterns. The number of patterns is

$\frac{1}{o(G)} \sum_{p \in G} n^{c(p)}$,    (3.36)

where $c(p)$ is the total number of cycles (including cycles of length 1) in the disjoint
cycle factorization of $p$.

Proof. The result is an immediate consequence of Lemma 3.5.3 and Burnside's
lemma. ■
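
The counting formula of Theorem 3.5.4, the average over the group of n raised to the
number of cycles, takes only a few lines to implement. The sketch below is ours: the
function names count_cycles and count_patterns are hypothetical, and each permutation
p is stored as the tuple (p(1), ..., p(m)).

```python
def count_cycles(p):
    """Number of cycles of p, counting cycles of length 1."""
    seen, cycles = set(), 0
    for start in range(1, len(p) + 1):
        if start not in seen:
            cycles += 1
            i = start
            while i not in seen:
                seen.add(i)
                i = p[i - 1]
    return cycles


def count_patterns(group, n):
    """Number of color patterns with n colors: average of n**c(p) over the group."""
    total = sum(n ** count_cycles(p) for p in group)
    # The sum must be divisible by the group order when `group` really is a group.
    assert total % len(group) == 0
    return total // len(group)


# Plane rotations of the square: e, (1243), (14)(23), (1342).
R = [(1, 2, 3, 4), (2, 4, 1, 3), (4, 3, 2, 1), (3, 1, 4, 2)]
# The four flips: (12)(34), (13)(24), (14), (23).
FLIPS = [(2, 1, 4, 3), (3, 4, 1, 2), (4, 2, 3, 1), (1, 3, 2, 4)]
D4 = R + FLIPS

print(count_patterns(R, 3))    # 24
print(count_patterns(D4, 3))   # 21
```

Changing the second argument changes the number of available colors; with two colors,
count_patterns(R, 2) returns 6.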



3.5.5 Example. Suppose each face of a cube is painted red, white, or blue.
There are $3^6 = 729$ ways to do it. Say two colored cubes are equivalent if one of
them can be rotated so as to appear identical to the other one. Let's use Theorem
3.5.4 to count the resulting color patterns. Imported from Fig. 3.4.5, the octahedral
group of rotational symmetries of the cube is exhibited in Fig. 3.5.4. Letting
this group play the role of G in Equation (3.36) yields (don't forget the invisible
1-cycles)

$\frac{1}{24}\left[3^6 + 3 \times 3^4 + 6 \times 3^3 + 6 \times 3^3 + 8 \times 3^2\right] = 1368/24 = 57$.



Consider the colorings $f_1 = (r,b,w,w,b,r)$, $f_2 = (r,r,b,w,b,w)$, and
$f_3 = (r,w,b,w,b,r)$ exhibited in Fig. 3.5.5. In each of these colorings, two faces are
red, two are white, and two are blue. Because the white faces are opposite each other
in $f_1$ but adjacent in $f_3$, these two colorings are inequivalent. Moreover, because the red
faces are adjacent in $f_2$, it is equivalent neither to $f_1$ nor to $f_3$. Thus, we have distinct
representatives for three of the 57 color patterns. While it is helpful to know there
are (only) 54 patterns to go, it would help even more to know the color distributions
of the remaining patterns. How many more patterns, e.g., are comprised of
colorings that have two red faces, two white faces, and two blue faces? That kind of
information comes from a refinement of Theorem 3.5.4 known as Polya's theorem,
the subject of the next section. □

$f_1 = (r,b,w,w,b,r)$    $f_2 = (r,r,b,w,b,w)$    $f_3 = (r,w,b,w,b,r)$

Figure 3.5.5
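
The count of 57 can be reproduced without consulting a table of the rotation group. The
sketch below makes two assumptions of our own: it rebuilds the group by closing two
rotations from Example 3.4.4, (2453) and (132)(456), under composition, and it then
averages $n^{c(p)}$ over the result. All function names are illustrative.

```python
def perm_from_cycles(cycles, n=6):
    """Return the tuple (p(1), ..., p(n)) for the permutation given by its cycles."""
    image = list(range(n + 1))          # index 0 is unused
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            image[a] = b
    return tuple(image[1:])


def compose(q, p):
    """Return q o p (apply p first, then q)."""
    return tuple(q[p[i] - 1] for i in range(len(p)))


def generate(gens):
    """Return the set of all permutations obtainable by composing the generators."""
    identity = tuple(range(1, len(gens[0]) + 1))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group


def count_cycles(p):
    """Number of cycles of p, counting cycles of length 1."""
    seen, cycles = set(), 0
    for start in range(1, len(p) + 1):
        if start not in seen:
            cycles += 1
            i = start
            while i not in seen:
                seen.add(i)
                i = p[i - 1]
    return cycles


def count_patterns(group, n):
    """Average of n**c(p) over the group, i.e. the number of color patterns."""
    total = sum(n ** count_cycles(p) for p in group)
    return total // len(group)


G = generate([perm_from_cycles([(2, 4, 5, 3)]),
              perm_from_cycles([(1, 3, 2), (4, 5, 6)])])

print(sum(3 ** count_cycles(p) for p in G))   # 1368
print(count_patterns(G, 3))                   # 57
```

With only two colors, the same call with n = 2 returns 10.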



3.5. EXERCISES 

1 Suppose four colors are available to color the vertices of a square, say red, 
white, blue, and yellow. 

(a) Find $g = \hat{p}(f)$ if $p = (14)(23)$ and $f = (r,w,b,y)$.

(b) Find $g = \hat{p}(f)$ if $p = (1243)$ and $f = (r,w,b,y)$.

(c) Suppose $g = (r,r,w,b) \in P$, where P is one of the red-white-blue-yellow
color patterns modulo the group of plane rotations of the square. With
respect to the vertex numbering of Fig. 3.5.1, list all the elements of
$P \subset C_{4,3}$.

(d) Suppose $g = (r,r,w,b) \in P$, where P is one of the red-white-blue-yellow
color patterns modulo $D_4$. List all the elements of $P \subset C_{4,3}$.

(e) How many red-white-blue-yellow color patterns are there modulo the
group of plane rotations?

(f) How many red-white-blue-yellow color patterns are there modulo $D_4$?

2 Suppose just two colors are available to decorate the vertices of a square, say 
red and white. Counting the distinct representatives in Fig. 3.5.3 that don't 
involve any b's, one discovers that there are a total of six red-white color 
patterns modulo the symmetry group $G = \langle(1243)\rangle$ of plane rotations.



* Named for George Polya (1888-1985). 
(a) How many different colorings are equivalent, modulo G, to $f = (r, r, w, w)$?

(b) How many different colorings are equivalent, modulo G, to $f = (r, w, r, w)$?

(c) Confirm that $\frac{1}{4}\sum_{p \in G} 2^{c(p)} = 6$.

(d) Use Theorem 3.5.4 to compute the number of inequivalent red-white vertex
colorings of the square modulo $D_4$.

3 Say that two vertex colorings of a regular pentagon are equivalent if one can be 
obtained from the other by a plane rotation. Suppose n colors are available. 

(a) Use Theorem 3.5.4 to show that there are eight color patterns when n = 2. 

(b) Find a system of distinct representatives for the eight color patterns in 
part (a). 

(c) Show that there are 51 color patterns when n = 3. 

(d) Compute the number of color patterns when n = 4. 

(e) If n is relatively prime to 5, prove that $n^4 + 4$ is a multiple of 5.

(f) If n is relatively prime to 5, prove that $n^4 - 1$ is a multiple of 5.

(g) Let p be, not a permutation, but a prime number. If n is relatively prime to p,
prove that p is a factor of $n^{p-1} - 1$.

4 Which of the eight inequivalent color patterns in Exercise 3(b) are equivalent
modulo the group $D_5$ of all 10 symmetries of a regular pentagon?

5 Show that there are 39 inequivalent 3-colorings of the vertices of a regular
pentagon modulo $D_5$.

6 Suppose n colors are available to decorate the vertices of a regular hexagon.
Compute the number of color patterns modulo $D_6$ (see Exercise 6(b),
Section 3.4) when

(a) n = 2. (b) n = 3. (c) n = 4. 

7 Three of the six rotationally inequivalent red-white-blue colorings of the 
faces of a cube in which each color is used twice are given in Fig. 3.5.5. 
Exhibit the other three 

(a) using pictures. (b) using functions. 

8 Modulo its group of 12 rotational symmetries (see Exercise 9, Section 3.4), 
how many inequivalent n-colorings of the faces of a regular tetrahedron are 
there when 

(a) n = 2? (b) n = 3? (c) n = 4? 

9 Modulo the group of all its symmetries, how many inequivalent n-colorings of 
the faces of a regular tetrahedron are there when 

(a) n = 2? (b) n = 3? (c) n = 4? 

10 Express $o(\{p \in S_m : c(p) = r\})$ in terms of Stirling numbers.

11 Prove that the falling factorial function

$$x^{(m)} = (-1)^m \sum_{p \in S_m} (-x)^{c(p)}.$$

12 There is a natural one-to-one correspondence between binary words of length 3 
and points in three-dimensional space. The word 010, e.g., corresponds to the 
point (0, 1, 0). 

(a) Show that the 2 3 = 8 different binary words of length 3 correspond to the 
vertices of a cube. 

(b) Show that there is a one-to-one correspondence between $(3, M, d)$ codes
and vertex colorings of the cube using 2 colors.

(c) If two $(3, M, d)$ codes are defined to be equivalent when the corresponding
vertex 2-colorings of the cube are equivalent modulo its group of 48
symmetries, how many inequivalent $(3, M, d)$ codes are there? (Hint:
Exercise 11, Section 3.4.)

(d) Suppose $\mathscr{C}_1$ is a $(3, M_1, d_1)$ code and $\mathscr{C}_2$ is a $(3, M_2, d_2)$ code. If $\mathscr{C}_1$ and $\mathscr{C}_2$
are equivalent (in the sense of part (c)), show that $M_1 = M_2$. Is $d_1 = d_2$?

13 In how many inequivalent ways can the eight faces of an octahedron be 
colored, modulo the group of its 24 rotational symmetries, 

(a) using two colors? 

(b) using three colors? 

(c) using ten colors? 

14 In how many inequivalent ways can the eight faces of an octahedron be 
colored, modulo the group of all 48 of its symmetries, 

(a) using two colors? (Hint: Compare with Exercise 12(c).) 

(b) using three colors? 

(c) using ten colors? 

15 Express the number of inequivalent vertex colorings of a regular octagon,
modulo its group of plane symmetries, as a polynomial in n, the number of
available colors.

16 In how many inequivalent ways can the six edges of a regular tetrahedron be 
2-colored, 

(a) modulo the group of its 12 rotational symmetries. (Hint: Exercise 12, 
Section 3.4.) 

(b) modulo the group of all 24 of its symmetries. 

17 Fifteen billiard balls can be racked into a triangular array as shown in
Fig. 3.5.6. Assume the balls are available in (unlimited quantities of) red,
white, and blue. Modulo the symmetry group of plane rotations of the rack,
how many inequivalent color patterns of balls are possible?

Figure 3.5.6



18 Nuclear magnetic resonance (NMR) is produced by a magnetic field asso- 
ciated with unpaired nuclear spins. There are two possibilities for the spin of 
an ordinary hydrogen nucleus (a proton): spin up and spin down. The NMR 
phenomenon is observed by placing a sample in a steady magnetic field and 
simultaneously exciting the sample with radio waves. The frequency of the 
radiation and the strength of the magnetic field can be adjusted to produce 
absorption of the radio waves. (Among the triumphs of quantum mechanics is 
a theoretical understanding of these, and other, spectral lines.) 

Free hydrogen can exist either as atomic hydrogen, $H_1$, or as molecular
hydrogen, $H_2$. Suppose some random cubic meter of intergalactic space
contains four hydrogen atoms. Imagine using NMR spectroscopy to determine
whether the hydrogens are in atomic or molecular form. The first step is to
analyze the various possibilities. Suppose we "color" each of the nuclei using
two colors: up and down.

(a) The group of symmetries for the system $4H_1$ is $S_4$. Show, in this case, that
five nuclear magnetic states (inequivalent 2-colorings) are possible. (Give
two arguments, one based on common sense and one based on
Theorem 3.5.4.)

(b) Numbering the atoms of one molecule 1 and 2, and the atoms of the
second 3 and 4, show that the group of symmetries for the system $2H_2$ is
$\{e_4, (12), (34), (12)(34), (13)(24), (14)(23), (1324), (1423)\}$.

(c) How many states are possible for the system $2H_1 + H_2$?

19 If $A = (a_{ij})$ is an $m \times m$ matrix, then

$$\det(A) = \sum_{p \in S_m} (-1)^{m - c(p)} \prod_{t=1}^{m} a_{t\,p(t)}.$$

The permanent function is defined by

$$\operatorname{per}(A) = \sum_{p \in S_m} \prod_{t=1}^{m} a_{t\,p(t)},$$

i.e., the permanent is the determinant without the alternating minus signs. Let
$J_m$ be the $m \times m$ matrix each of whose entries is 1. Prove that

(a) $\operatorname{per}(J_m) = m!$

(b) $\operatorname{per}(J_m - I_m) = D(m)$, the mth derangement number.

20 Suppose G is a group of symmetries of some object O. Denote by N(G, n) the
number of inequivalent colorings of the faces of O modulo G when n colors 
are available. 

(a) Prove that N(G, s) < N(G, t) whenever s < t. 

(b) Prove that N(G, n) < N(H, n) whenever H is a subgroup of G. 

(c) Must the inequality in part (b) be strict when H is a proper subgroup of G? 

21 Let $\hat{p}$ be the induced action of $p \in S_m$ defined by $\hat{p}(f) = fp^{-1}$, $f \in C_{m,n}$. If
$n > 1$, prove that $\hat{p} = \hat{q}$ if and only if $p = q$.



3.6. POLYA'S THEOREM 

A little inaccuracy sometimes saves a ton of explanation. 

— H. H. Munro 

Modulo its symmetry group of plane rotations, there are 24 inequivalent ways to 
color the vertices of a square red, white or blue, a number obtained by identifying 
equivalence classes of colorings with the orbits of an artfully crafted permutation 
group. A system of distinct representatives (SDR) for the 24 color patterns, 
described by means of geometric pictures, can be found in Fig. 3.5.3. With respect 
to the vertex numbering in Fig. 3.6.1, the function manifestation of the SDR appears 
in Fig. 3.6.2. 

During a previous discussion of balls and urns, it was productive at one point to 
deviate from the usual sequence notation and describe functions using words, e.g., 
substituting rrbw for (r, r, b, w). It is a well-documented phenomenon of human 
nature that people typically see what they expect to see. Told to expect a word, 
we look at rrbw and our thoughts turn to pronunciation. Told to expand
$(r + w + b)^4$, we look at rrbw and our thoughts turn to algebraic expressions like
$r^2wb$. Told nothing about what to expect, we could misinterpret rrbw.



1 2 
O O 

O O 

3 4 

Figure 3.6.1 
(r, r, r, r) (w, w, w, w) (b, b, b, b)
(r, w, w, w) (w, r, r, r) (b, w, w, w) (r, b, b, b) (w, b, b, b) (b, r, r, r) 
(r, r, w, w) (r, w, w, r) (w, w, b, b) (w, b, b, w) (b, b, r, r) (b, r, r, b) 

(r, w, b, r) (w, b, r, w) (b, r, w, b) 

(r, r, w, b) (r, r, b, w) (w, w, r, b) (w, w, b, r) (b, b, r, w) (b, b, w, r) 

Figure 3.6.2 

What are the implications of a misinterpretation? Suppose we abbreviate the
function (r, r, b, w) with rrbw and then, due to some distraction or lapse of
concentration, find ourselves writing $r^2wb$. Let's call it the weight of (r, r, b, w). In passing
from a coloring to its weight, something gets lost. From the weight, we can
determine which colors are used and how often, but not which vertices get which colors.
Nevertheless, replacing (r, r, b, w) with the algebraic expression $r^2wb$ is surprisingly
useful.

Observe, first, that equivalent colorings have the same weight. (Rotating a 
colored square isn't going to change the number of its red vertices.) So, it makes 
sense to define the weight of a pattern to be the weight common to every coloring in 
the pattern. What makes things interesting is that inequivalent colorings can also 
have the same weight. Exactly three of the 24 inequivalent colorings represented
in Fig. 3.6.2, e.g., have weight $r^2wb$, namely, (r, w, b, r), (r, r, w, b), and (r, r,
b, w). The pattern inventory tracks just this sort of information. It is the polynomial
obtained by summing the weights of the distinct patterns. The pattern inventory for
the rotationally inequivalent red-white-blue vertex colorings of the square can be
obtained by replacing each function in Fig. 3.6.2 by its weight and then summing
the resulting monomials. After combining like terms, the outcome is

$$\begin{aligned} W_G(r, w, b) = {} & (r^4 + w^4 + b^4) + (rw^3 + r^3w + w^3b + rb^3 + wb^3 + r^3b) \\ & + 2(r^2w^2 + w^2b^2 + r^2b^2) + 3(r^2wb + rw^2b + rwb^2). \end{aligned} \tag{3.37}$$

Note that $W_G(1, 1, 1) = 24$, reflecting the fact that each pattern contributes one
monomial to $W_G$.
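Equation (3.37) can be reproduced by brute force. The sketch below is an illustration only; the names `ROTATION`, `cyclic_group`, and `pattern_inventory` are assumptions, and weights are recorded as exponent triples (number of r's, number of w's, number of b's) rather than as monomials. It lists the orbits of all 81 colorings under the rotation (1243) and tallies one weight per orbit.

```python
from collections import Counter
from itertools import product

# The 4-cycle (1243) of Fig. 3.6.1, rewritten on indices 0..3 (vertex v -> index v-1).
ROTATION = (1, 3, 0, 2)

def cyclic_group(generator):
    """All powers of one permutation, as a list of tuples."""
    identity = tuple(range(len(generator)))
    group, p = [identity], generator
    while p != identity:
        group.append(p)
        p = tuple(generator[j] for j in p)  # next power of the generator
    return group

def pattern_inventory(group, colors):
    """Map each weight (as an exponent tuple) to the number of orbits having it."""
    inventory, seen = Counter(), set()
    for f in product(colors, repeat=len(group[0])):
        if f in seen:
            continue
        # Orbit of f: all colorings obtained by permuting its positions.
        orbit = {tuple(f[p[i]] for i in range(len(f))) for p in group}
        seen |= orbit
        inventory[tuple(f.count(c) for c in colors)] += 1
    return inventory

SQUARE = pattern_inventory(cyclic_group(ROTATION), "rwb")
```

Here `SQUARE[(2, 1, 1)]` is the coefficient of $r^2wb$ and `sum(SQUARE.values())` is the total number of patterns. This approach needs every coloring in hand, which is exactly the labor the rest of the section tries to avoid.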

Starting from a system of distinct representatives for the color patterns, as we 
just did, it is easy to write down the pattern inventory. The hard part is finding 
the SDR! The focus of this section involves reversing the process, starting with 
the pattern inventory and using it as a guide while assembling a system of distinct 
representatives. If, e.g., you were in the midst of listing an SDR, perhaps having just
found a second pattern of weight $r^2w^2$, you would know from Equation (3.37) not to
waste time searching for a nonexistent third pattern of the same weight.

All right, how does one find the pattern inventory without first constructing a
system of distinct representatives? Let's approach it like a mystery and begin
with the clues. From Equation (3.37), $W_G(r, w, b)$ is a homogeneous polynomial
of degree 4 (because there are four vertices) in 3 variables (because there are three
colors). Moreover, from the physical nature of the problem, $W_G(r, w, b)$ is a
symmetric polynomial. So, it is a linear combination of minimal symmetric
polynomials:

$$W_G(r, w, b) = M_{[4]}(r, w, b) + M_{[3,1]}(r, w, b) + 2M_{[2^2]}(r, w, b) + 3M_{[2,1^2]}(r, w, b). \tag{3.38}$$

(Confirm the equivalence of Equations (3.37) and (3.38).)

One way to proceed might be to look for an analogue of the multinomial theorem,
a formula for the coefficient of $M_\pi$ in the expansion of $W_G$ as a linear
combination of minimal symmetric polynomials. If such a formula exists, it has
not yet been found. What has been discovered is a little different. It is an algorithm
for expressing $W_G$ as a polynomial in the power sums $M_k = M_{[k]}$, $1 \le k \le m$. (This
is a little like ordering a hamburger and being served a hot dog!)

So far, our discussion has been limited to the motivating example of red-white- 
blue vertex colored squares. If that sort of thing were all Polya's theorem is good 
for, it would not be worth mentioning. To enable the full range of applications, we 
need to retrace our steps in a more general setting. 

Recall that $C_{m,n}$ is the set of all $n^m$ functions

$$f : \{1, 2, \ldots, m\} \to \{x_1, x_2, \ldots, x_n\}$$

and that each $p \in S_m$ induces a one-to-one function $\hat{p} : C_{m,n} \to C_{m,n}$ defined by
$\hat{p}(f) = fp^{-1}$. If G is a permutation group of degree m, then $\hat{G} = \{\hat{p} : p \in G\}$ can
be viewed as a permutation group of degree $n^m$ acting on $C_{m,n}$. When G is a
symmetry group and $\{x_1, x_2, \ldots, x_n\}$ is a set of colors, the orbits of $C_{m,n}$ modulo G are
the color patterns. Finally, from Burnside's lemma and the fact that the number of
fixed points of $\hat{p}$ is $F(\hat{p}) = n^{c(p)}$, the total number of color patterns modulo G is

$$t = \frac{1}{o(G)} \sum_{p \in G} n^{c(p)}, \tag{3.39}$$

where $c(p)$ is the number of cycles in the disjoint cycle factorization, not of $\hat{p}$, but
of p.

3.6.1 Definition. Treating the colors $x_1, x_2, \ldots, x_n$ that comprise the range of
$f \in C_{m,n}$ as independent variables, the weight of f is

$$w(f) = \prod_{i=1}^{m} f(i).$$

Evidently, $w(f)$ is a monomial of (total) degree m.
3.6.2 Example. In the case of red-white-blue vertex colorings of the square,
$n = 3$, $x_1 = r$, $x_2 = w$, and $x_3 = b$. If, e.g., $f = (r, r, b, w)$, then

$$w(f) = f(1)f(2)f(3)f(4) = rrbw = r^2wb. \qquad \square$$

Example 3.6.2 shows that Definition 3.6.1 is consistent with our original notion 
of weight. We now confirm, in the general setting, that equivalent colorings have 
the same weight. 

3.6.3 Lemma. For all $p \in S_m$ and all $f \in C_{m,n}$,

$$w(\hat{p}(f)) = w(f). \tag{3.40}$$

In particular, $w(f) = w(g)$ whenever f and g are equivalent modulo G.

Proof. Let $g = \hat{p}(f) = fp^{-1}$. Then $gp = f$. Because multiplication is commutative
and $p \in S_m$,

$$w(g) = \prod_{i=1}^{m} g(i) = \prod_{i=1}^{m} g(p(i)) = \prod_{i=1}^{m} f(i) = w(f). \qquad \blacksquare$$

Suppose P is one of the color patterns (orbits) of $C_{m,n}$ modulo G. If $f, g \in P$, then
$g = \hat{p}(f)$ for some $p \in G$ and, by Lemma 3.6.3, $w(g) = w(f)$. This brings us, at
last, to a formal definition of pattern inventory.

3.6.4 Definition. Suppose G is a permutation group of degree m. Let $P_1, P_2, \ldots, P_t$
be the distinct color patterns (orbits) of $C_{m,n}$ modulo G. The weight of $P_i$ is the
common value of $w(f)$, $f \in P_i$. The sum of the weights of the orbits is the pattern
inventory

$$W_G(x_1, x_2, \ldots, x_n) = \sum_{i=1}^{t} w(P_i). \tag{3.41}$$

Because $W_G(1, 1, \ldots, 1) = t$, the number of patterns, it follows from
Equation (3.39) that

$$W_G(1, 1, \ldots, 1) = \frac{1}{o(G)} \sum_{p \in G} n^{c(p)}. \tag{3.42}$$
Now that all the formal definitions are in place, let's return to the issue of
evaluating $W_G(x_1, x_2, \ldots, x_n)$. While it is important to keep m and G general, no
real generality is lost if we take $n = 3$ and set $x_1 = r$, $x_2 = w$, and $x_3 = b$.

Consider a fixed but arbitrary nonnegative integer solution to the equation
$i + j + k = m$. By definition, the coefficient of $r^i w^j b^k$ in $W_G(r, w, b)$ is the number
of color patterns of weight $r^i w^j b^k$. Denote the union of these patterns by $C_{m,n}(i, j, k)$.
Then $C_{m,n}(i, j, k)$ is the set of all colorings of weight $r^i w^j b^k$.

If $p \in G$ then, by Lemma 3.6.3, $\hat{p}$ permutes the elements of $C_{m,n}(i, j, k)$ among
themselves. Define $\hat{p}_{(i,j,k)}$ to be the restriction of $\hat{p}$ to $C_{m,n}(i, j, k)$, and let
$\hat{G}_{(i,j,k)} = \{\hat{p}_{(i,j,k)} : p \in G\}$. Then $\hat{G}_{(i,j,k)}$ is a permutation group acting on $C_{m,n}(i, j, k)$.
Moreover, two colorings of $C_{m,n}(i, j, k)$ are equivalent modulo $\hat{G}_{(i,j,k)}$ if and only if they
are equivalent modulo G. So, the number of orbits of G having weight $r^i w^j b^k$ is
equal to the total number of orbits of $\hat{G}_{(i,j,k)}$. Let's apply Burnside's lemma to
$\hat{G}_{(i,j,k)}$ and deduce that the number of color patterns modulo G of weight $r^i w^j b^k$
is given by

$$\frac{1}{o(G)} \sum_{p \in G} F(\hat{p}_{(i,j,k)}). \tag{3.43}$$

Because Formula (3.43) is the coefficient of $r^i w^j b^k$ in $W_G(r, w, b)$, it must be that

$$\begin{aligned} W_G(r, w, b) &= \sum_{i+j+k=m} \left( \frac{1}{o(G)} \sum_{p \in G} F(\hat{p}_{(i,j,k)}) \right) r^i w^j b^k \\ &= \frac{1}{o(G)} \sum_{p \in G} \left( \sum_{i+j+k=m} F(\hat{p}_{(i,j,k)})\, r^i w^j b^k \right). \end{aligned} \tag{3.44}$$

It remains to evaluate

$$\sum_{i+j+k=m} F(\hat{p}_{(i,j,k)})\, r^i w^j b^k. \tag{3.45}$$

Consider an example. If $m = 7$, the colorings can be identified with functions
$f : \{1, 2, 3, 4, 5, 6, 7\} \to \{r, w, b\}$. Let $q = (12)(34)(567)$. Then f is a fixed point
of $\hat{q}$ if and only if $f(1) = f(2)$, $f(3) = f(4)$, and $f(5) = f(6) = f(7)$. As we saw
in the last section, the number of fixed points, $F(\hat{q})$, is equal to the number of
ways to make a sequence of $c(q) = 3$ decisions each having three choices (namely,
r, w, or b). Therefore, $F(\hat{q}) = 3^{c(q)} = 27$.

Of the 27 fixed points of $\hat{q}$, one is $f_1 = (r, r, w, w, b, b, b)$, a coloring of weight
$w(f_1) = r^2w^2b^3$. Another fixed point of $\hat{q}$ is $f_2 = (w, w, r, r, b, b, b)$. Because
$w(f_2) = r^2w^2b^3 = w(f_1)$, $F(\hat{q}_{(2,2,3)}) \ge 2$. A third fixed point of $\hat{q}$ is (b, b, w, w,
w, w, w), having weight $w^5b^2$. If we listed all 27 fixed points of $\hat{q}$ and summed
their weights, the result would be

$$\begin{aligned} \sum_{i+j+k=m} F(\hat{q}_{(i,j,k)})\, r^i w^j b^k = {} & (r^7 + w^7 + b^7) + 2(r^5w^2 + r^5b^2 + r^2w^5 + r^2b^5 + w^5b^2 + w^2b^5) \\ & + (r^4w^3 + r^4b^3 + r^3w^4 + r^3b^4 + w^4b^3 + w^3b^4) \\ & + 2(r^2w^2b^3 + r^2w^3b^2 + r^3w^2b^2), \end{aligned} \tag{3.46}$$

the special case of Formula (3.45) corresponding to $p = q$. (From the term $2r^2w^2b^3$,
we deduce that $f_1$ and $f_2$ are the only fixed points of $\hat{q}$ that have weight $r^2w^2b^3$.)

* While this statement is correct, it is not completely justified by the discussion. The problem is that
$p \to \hat{p}_{(i,j,k)}$ need not be one-to-one. The argument can be made rigorous by using the tools of abstract
group homomorphisms, their kernels, and the corresponding quotient groups.

Because $f \in C_{7,3}$ is a fixed point of $\hat{q}$ if and only if f is constant on the three
cycles of $q = (12)(34)(567) \in S_7$, Equation (3.46) is an inventory of the weights
$w(f) = c_1^2 c_2^2 c_3^3$, where $c_1 \in \{r, w, b\}$ is the color $f(1) = f(2)$, $c_2 \in \{r, w, b\}$ is the
(not necessarily different) color $f(3) = f(4)$, and $c_3 \in \{r, w, b\}$ is the color
$f(5) = f(6) = f(7)$. But, there is another way to inventory these same
weights! From the alternative view of distributivity used, e.g., to prove the binomial
theorem,

$$\sum_{i+j+k=m} F(\hat{q}_{(i,j,k)})\, r^i w^j b^k = (r^2 + w^2 + b^2)^2 (r^3 + w^3 + b^3) = M_2(r, w, b)^2 M_3(r, w, b),$$

where $M_t(r, w, b) = r^t + w^t + b^t$ is the tth power sum. (Confirm that the right-hand
side of this equation is equal to the right-hand side of Equation (3.46).)
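The requested confirmation can also be done mechanically. In the following sketch (an illustration, with all names assumed), polynomials in r, w, b are stored as dictionaries from exponent triples to coefficients. One side tallies the weights of the colorings fixed by $q = (12)(34)(567)$; the other multiplies out the power sums.

```python
from collections import Counter
from itertools import product

COLORS = "rwb"
Q = (1, 0, 3, 2, 5, 6, 4)  # q = (12)(34)(567), rewritten on indices 0..6

def fixed_point_weights(q, colors):
    """Tally the weights of all colorings f that are constant on the cycles of q."""
    tally = Counter()
    for f in product(colors, repeat=len(q)):
        if all(f[q[i]] == f[i] for i in range(len(q))):
            tally[tuple(f.count(c) for c in colors)] += 1
    return tally

def power_sum(t, n):
    """x_1^t + ... + x_n^t as a Counter keyed by exponent tuples."""
    return Counter({tuple(t if j == i else 0 for j in range(n)): 1 for i in range(n)})

def multiply(a, b):
    """Product of two polynomials stored as exponent-tuple Counters."""
    out = Counter()
    for ea, ca in a.items():
        for eb, cb in b.items():
            out[tuple(x + y for x, y in zip(ea, eb))] += ca * cb
    return out

LHS = fixed_point_weights(Q, COLORS)
M2, M3 = power_sum(2, 3), power_sum(3, 3)
RHS = multiply(multiply(M2, M2), M3)
```

Comparing `dict(LHS)` with `dict(RHS)` checks the identity term by term; the entry for the key `(2, 2, 3)` is the coefficient of $r^2w^2b^3$.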

Returning to the general case, let $c_i(p)$ be, not some color, but the number of
cycles of length i in the disjoint cycle factorization of $p \in S_m$. Using the arguments
illustrated above for $q = (12)(34)(567)$, it follows that the weights of the fixed
points of $\hat{p}$ are inventoried by

$$\begin{aligned} \sum_{i+j+k=m} F(\hat{p}_{(i,j,k)})\, r^i w^j b^k &= (r + w + b)^{c_1(p)} (r^2 + w^2 + b^2)^{c_2(p)} \cdots (r^m + w^m + b^m)^{c_m(p)} \\ &= M_1(r, w, b)^{c_1(p)} M_2(r, w, b)^{c_2(p)} \cdots M_m(r, w, b)^{c_m(p)} \end{aligned}$$

for any $p \in S_m$. Substituting this identity into Equation (3.44) yields

$$W_G(r, w, b) = \frac{1}{o(G)} \sum_{p \in G} M_1(r, w, b)^{c_1(p)} M_2(r, w, b)^{c_2(p)} \cdots M_m(r, w, b)^{c_m(p)}.$$
Figure 3.6.3



The generalization to n colors is this: 

3.6.5 Polya's Theorem. If G is a subgroup of $S_m$, then the pattern inventory for
the orbits of $C_{m,n}$ modulo G is

$$W_G(x_1, x_2, \ldots, x_n) = \frac{1}{o(G)} \sum_{p \in G} M_1^{c_1(p)} M_2^{c_2(p)} \cdots M_m^{c_m(p)}, \tag{3.47}$$

where $M_k = M_{[k]}(x_1, x_2, \ldots, x_n) = x_1^k + x_2^k + \cdots + x_n^k$, the kth power sum of the x's.

So, there it is: an algorithm, depending only on G, for expressing the pattern
inventory, $W_G$, as a polynomial in the power sums. The unfortunate thing is that
it should look so complicated. In fact, there is less here than meets the eye.

Note that Polya's theorem is consistent with our earlier formula for the number
of patterns: If $x_1 = x_2 = \cdots = x_n = 1$, then $M_k = n$ for all k. Because

$$c_1(p) + c_2(p) + \cdots + c_m(p) = c(p), \tag{3.48}$$

the total number of cycles of p, Equation (3.42) is an easy consequence of
Equation (3.47).
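Equation (3.47) is mechanical enough to hand to a computer. What follows is a minimal sketch under stated assumptions: permutations are tuples on indices 0..m-1, polynomials are dictionaries from exponent tuples to coefficients, and the names `cycle_lengths`, `power_sum`, `multiply`, and `polya_inventory` are invented for this illustration. The sample group is the group of plane rotations of the square, with the vertex numbering of Fig. 3.6.1 shifted down by one.

```python
from collections import Counter

def cycle_lengths(p):
    """Lengths of the cycles of p, including 1-cycles."""
    seen, lengths = set(), []
    for start in range(len(p)):
        if start in seen:
            continue
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            i = p[i]
            length += 1
        lengths.append(length)
    return lengths

def power_sum(k, n):
    """M_k = x_1^k + ... + x_n^k as a Counter keyed by exponent tuples."""
    return Counter({tuple(k if j == i else 0 for j in range(n)): 1 for i in range(n)})

def multiply(a, b):
    """Product of two polynomials stored as exponent-tuple Counters."""
    out = Counter()
    for ea, ca in a.items():
        for eb, cb in b.items():
            out[tuple(x + y for x, y in zip(ea, eb))] += ca * cb
    return out

def polya_inventory(group, n):
    """Pattern inventory for n colors: average over the group of the product of M_k's."""
    total = Counter()
    for p in group:
        term = Counter({(0,) * n: 1})
        for k in cycle_lengths(p):  # one factor M_k per cycle of length k
            term = multiply(term, power_sum(k, n))
        total.update(term)          # Counter.update adds coefficients
    order = len(group)
    assert all(c % order == 0 for c in total.values())
    return {e: c // order for e, c in total.items()}

# e, (1243), (14)(23), (1342) on indices 0..3
SQUARE_ROTATIONS = [(0, 1, 2, 3), (1, 3, 0, 2), (3, 2, 1, 0), (2, 0, 3, 1)]
SQUARE_INVENTORY = polya_inventory(SQUARE_ROTATIONS, 3)
```

Unlike a direct enumeration of orbits, this never looks at an individual coloring; only the cycle structure of each group element is used, which is the point of the theorem.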

3.6.6 Example. Let's apply Polya's theorem to red-white-blue colorings of the
vertices of a square, modulo the group $G = \langle(1243)\rangle$ of plane rotations. Substituting
the information from Fig. 3.6.3 into Equation (3.47) yields

$$W_G(r, w, b) = \frac{1}{4}\left[(r + w + b)^4 + 2(r^4 + w^4 + b^4) + (r^2 + w^2 + b^2)^2\right]. \tag{3.49}$$

From the multinomial theorem,

$$(r + w + b)^4 = M_{[4]}(r, w, b) + 4M_{[3,1]}(r, w, b) + 6M_{[2^2]}(r, w, b) + 12M_{[2,1^2]}(r, w, b),$$

and

$$\begin{aligned} (r^2 + w^2 + b^2)^2 &= M_{[2]}(r^2, w^2, b^2) + 2M_{[1^2]}(r^2, w^2, b^2) \\ &= M_{[4]}(r, w, b) + 2M_{[2^2]}(r, w, b). \end{aligned}$$

Together with Equation (3.49) and

$$2(r^4 + w^4 + b^4) = 2M_{[4]}(r, w, b),$$

these identities produce

$$\begin{aligned} W_G(r, w, b) &= \frac{1}{4}\left[4M_{[4]}(r, w, b) + 4M_{[3,1]}(r, w, b) + 8M_{[2^2]}(r, w, b) + 12M_{[2,1^2]}(r, w, b)\right] \\ &= M_{[4]}(r, w, b) + M_{[3,1]}(r, w, b) + 2M_{[2^2]}(r, w, b) + 3M_{[2,1^2]}(r, w, b), \end{aligned}$$

which is precisely Equation (3.38). □

* Polya's 1937 paper revolutionized combinatorial enumeration. In 1960, F. Harary pointed out that many
of Polya's ideas had been anticipated in 1927 by J. H. Redfield. However, it was only after Polya had
articulated and explained the ideas that anyone was able to make sense of Redfield's paper.

3.6.7 Example. Let's work out the pattern inventory for the 57 red-white-blue
color patterns for the faces of the cube, modulo the group G consisting of its 24
rotational symmetries. To get started, we need an analogue of Fig. 3.6.3, but it
need not have 24 rows. To evaluate Equation (3.47), all we really need are the
numbers of permutations of each cycle type. Because (Example 3.5.5) the permutations
of G come in five different cycle types, only five rows are needed. In Fig. 3.6.4, the
column labeled "#" contains the number of permutations of G having the same
cycle type as the permutation in column "p". Substituting this information into
Polya's theorem, we obtain

$$\begin{aligned} W_G(r, w, b) = \frac{1}{24}\big[ & (r + w + b)^6 + 3(r + w + b)^2 (r^2 + w^2 + b^2)^2 \\ & + 6(r + w + b)^2 (r^4 + w^4 + b^4) + 6(r^2 + w^2 + b^2)^3 \\ & + 8(r^3 + w^3 + b^3)^2 \big]. \end{aligned} \tag{3.50}$$

Equation (3.50) is the hot dog. It is an expression for the pattern inventory as a
polynomial in the power sums. What stands between us and the coefficient of $M_\pi$ in
the expansion of $W_G$ (the hamburger) is a pile of computations. The silver lining
is that we do not always need the coefficient of $M_\pi$ for every $\pi \vdash m$. It
might happen, e.g., that our interest does not extend beyond patterns of weight
$rw^2b^3$.

Okay, what is the coefficient of $rw^2b^3$ in Equation (3.50)? Because every term of
the product $6(r + w + b)^2(r^4 + w^4 + b^4)$ contains a fourth power, it contributes
nothing of the form $rw^2b^3$. Since neither $6(r^2 + w^2 + b^2)^3$ nor $8(r^3 + w^3 + b^3)^2$
involves a first power, they cannot contribute terms of the form $rw^2b^3$ either.
From the multinomial theorem, $(r + w + b)^6$ contributes $60rw^2b^3$.

What about $3(r + w + b)^2(r^2 + w^2 + b^2)^2$? Because the single r must come
from the factor $(r + w + b)^2$, the contribution from this term is the product

$$3 \times 2rb \times 2w^2b^2 = 12rw^2b^3.$$

So, the coefficient of $rw^2b^3$ in $W_G(r, w, b)$ is $\frac{1}{24}(60 + 12) = 3$. Of the 57 rotationally
inequivalent red-white-blue color patterns of the cube, there are exactly 3 in
which one face is painted red, two faces are painted white, and three are painted
blue. (It is important to understand that Polya's theorem tells us nothing about
how to find distinct representatives for the three color patterns of weight $rw^2b^3$.)

With the tedious computations all completed, Equation (3.50) yields the
hamburger

$$\begin{aligned} W_G(r, w, b) = {} & M_{[6]}(r, w, b) + M_{[5,1]}(r, w, b) + 2M_{[4,2]}(r, w, b) \\ & + 2M_{[4,1^2]}(r, w, b) + 2M_{[3^2]}(r, w, b) \\ & + 3M_{[3,2,1]}(r, w, b) + 6M_{[2^3]}(r, w, b) \\ = {} & (r^6 + w^6 + b^6) + (r^5w + \cdots + wb^5) + 2(r^4w^2 + \cdots + w^2b^4) \\ & + 2(r^4wb + rw^4b + rwb^4) + 2(r^3w^3 + r^3b^3 + w^3b^3) \\ & + 3(r^3w^2b + \cdots + rw^2b^3) + 6r^2w^2b^2. \end{aligned} \tag{3.51}$$

Suppose some businessman wanted to manufacture and sell red-white-blue
painted cubes in all 57 varieties. He might organize his stock in 57 drawers, one
for each pattern, and make use of a system of distinct representatives to label
the drawers. It might even make sense to organize the drawers into filing cabinets
according to weight. Given the one-to-one correspondence between weights, $r^i w^j b^k$,
and nonnegative integer solutions of the equation $i + j + k = 6$, this scheme would
require $C(6 + 3 - 1, 6) = 28$ filing cabinets each having 1, 2, 3, or 6 drawers. A
customer interested in colorings with 1 red, 2 white, and 3 blue faces could be
led to the cabinet labeled $rw^2b^3$ and offered a choice of three drawers (the
coefficient of $M_{[3,2,1]}(r, w, b)$ in $W_G(r, w, b)$). □
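The "tedious computations" behind Equation (3.51) are a natural job for a short program. The sketch below is illustrative only: it takes the five cycle types and their multiplicities as they appear in Equation (3.50), stores polynomials as dictionaries from exponent triples to coefficients, and uses names (`CUBE_CYCLE_TYPES`, `inventory_from_cycle_types`) that are assumptions of this example.

```python
from collections import Counter

# (number of rotations, cycle lengths on the six faces), read off Equation (3.50).
CUBE_CYCLE_TYPES = [
    (1, [1, 1, 1, 1, 1, 1]),
    (3, [1, 1, 2, 2]),
    (6, [1, 1, 4]),
    (6, [2, 2, 2]),
    (8, [3, 3]),
]

def power_sum(k, n):
    """M_k = x_1^k + ... + x_n^k as a Counter keyed by exponent tuples."""
    return Counter({tuple(k if j == i else 0 for j in range(n)): 1 for i in range(n)})

def multiply(a, b):
    """Product of two polynomials stored as exponent-tuple Counters."""
    out = Counter()
    for ea, ca in a.items():
        for eb, cb in b.items():
            out[tuple(x + y for x, y in zip(ea, eb))] += ca * cb
    return out

def inventory_from_cycle_types(cycle_types, n):
    """Expand the bracket of Polya's theorem and divide by the group order."""
    order = sum(count for count, _ in cycle_types)
    total = Counter()
    for count, lengths in cycle_types:
        term = Counter({(0,) * n: count})
        for k in lengths:
            term = multiply(term, power_sum(k, n))
        total.update(term)  # Counter.update adds coefficients
    assert all(c % order == 0 for c in total.values())
    return {e: c // order for e, c in total.items()}

CUBE = inventory_from_cycle_types(CUBE_CYCLE_TYPES, 3)
```

In the businessman's terms, `len(CUBE)` is the number of filing cabinets, `CUBE[(1, 2, 3)]` is the number of drawers in the cabinet labeled $rw^2b^3$, and `sum(CUBE.values())` is the total number of drawers.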

* By symmetry, $3M_{[3,2,1]}(r, w, b)$ must be a summand of $W_G(r, w, b)$.
3.6. EXERCISES

1 Consider $G = \langle(1243)\rangle$, the group of plane rotations of the square illustrated in
Fig. 3.6.1.

(a) Show that $W_G(r, b) = M_{[4]}(r, b) + M_{[3,1]}(r, b) + 2M_{[2^2]}(r, b)$.

(b) Express $W_G(r, b)$ as a polynomial in the power sums $M_k = M_{[k]}(r, b)$, $1 \le k \le 2$.

2 Consider the red-white-blue vertex color patterns of a square, modulo $G = D_4$.
Compute the pattern inventory $W_G(r, w, b)$

(a) using Fig. 3.5.3. 

(b) using Polya's theorem. 

3 Let $G = \langle(123)\rangle$, the group of plane rotations of an equilateral triangle,
expressed as permutations of its vertices.

(a) Show that, as a polynomial in the power sums $M_k = M_{[k]}(r, w, b)$,
$1 \le k \le 3$,

$$W_G(r, w, b) = \tfrac{1}{3}M_1^3 + \tfrac{2}{3}M_3.$$

(b) Show that, as a linear combination of minimal symmetric polynomials,

$$W_G(r, w, b) = M_{[3]}(r, w, b) + M_{[2,1]}(r, w, b) + 2M_{[1^3]}(r, w, b).$$

(c) Exhibit a system of distinct representatives for the red-white-blue color 
patterns of the vertices of an equilateral triangle modulo G. 

4 Let $G = D_3$, the group of all symmetries of an equilateral triangle, expressed as
permutations of its vertices.

(a) Show that

$$W_G(r, w, b) = \frac{1}{6}\left[(r + w + b)^3 + 3(r + w + b)(r^2 + w^2 + b^2) + 2(r^3 + w^3 + b^3)\right].$$

(b) Show that, as a linear combination of minimal symmetric polynomials,

$$W_G(r, w, b) = M_{[3]}(r, w, b) + M_{[2,1]}(r, w, b) + M_{[1^3]}(r, w, b).$$

(c) Which red-white-blue color pattern(s) modulo the group of plane rotations
of the equilateral triangle coalesce into a single pattern modulo $D_3 = S_3$?
(Hint: Exercise 3(c).)

(d) If (r, w, b) is dropped from each term in part (b), the result is
$W_G = M_{[3]} + M_{[2,1]} + M_{[1^3]}$. How would this expression change if a fourth
color, say green, were added to the palette? How would it change if there
were just two colors, say black and blue?

(e) Prove that $W_G(r, w, b) = H_3(r, w, b)$, the homogeneous symmetric function
of degree 3 from Exercise 25, Section 1.8.
5 Let $G = \langle(12345)\rangle$, the group of rotational symmetries of a regular pentagon,
expressed as permutations of its (consecutively numbered) vertices.

(a) Show that

$$W_G(r, w, b) = \frac{1}{5}\left[(r + w + b)^5 + 4(r^5 + w^5 + b^5)\right].$$

(b) Show that $W_G(1, 1, 1) = 51$.

(c) Show that

$$W_G(r, w, b) = M_{[5]}(r, w, b) + M_{[4,1]}(r, w, b) + 2M_{[3,2]}(r, w, b) + 4M_{[3,1^2]}(r, w, b) + 6M_{[2^2,1]}(r, w, b).$$

(d) Exhibit a system of distinct representatives for the four color patterns of
weight $rw^3b$.

(e) Exhibit a system of distinct representatives for the six color patterns of
weight $rw^2b^2$.

6 Consider red-white-blue vertex colorings of the regular pentagon modulo $D_5$,
the group of all 10 of its symmetries. (See Exercise 5 in Sections 3.4 and 3.5.)

(a) Show that

$$W_{D_5}(r, w, b) = \frac{1}{10}\left[(r + w + b)^5 + 5(r + w + b)(r^2 + w^2 + b^2)^2 + 4(r^5 + w^5 + b^5)\right].$$

(b) Show that $W_{D_5}(1, 1, 1) = 39$.

(c) Show that

$$W_{D_5}(r, w, b) = M_{[5]}(r, w, b) + M_{[4,1]}(r, w, b) + 2M_{[3,2]}(r, w, b) + 2M_{[3,1^2]}(r, w, b) + 4M_{[2^2,1]}(r, w, b).$$

(d) Use part (c) to prove that $W_{D_5}(1, 1, 1) = 39$.

(e) Exhibit a system of distinct representatives for the two color patterns of
weight $rw^3b$.

(f) Compare and contrast your answer to part (e) with your answer to
Exercise 5(d).

(g) Exhibit a system of distinct representatives for the four color patterns of
weight $rw^2b^2$.

(h) Compare and contrast your answer to part (g) with your answer to
Exercise 5(e).

(i) If green were to become available, so that the set of colors is $\{r, w, b, g\}$,
show that

$$W_{D_5}(r, w, b, g) = \frac{1}{10}\left[(r + w + b + g)^5 + 5(r + w + b + g)(r^2 + w^2 + b^2 + g^2)^2 + 4(r^5 + w^5 + b^5 + g^5)\right].$$
(j) Show that $W_{D_5}(1, 1, 1, 1) = 136$.

(k) Show that

$$W_{D_5}(r, w, b, g) = M_{[5]} + M_{[4,1]} + 2M_{[3,2]} + 2M_{[3,1^2]} + 4M_{[2^2,1]} + 6M_{[2,1^3]},$$

where $M_\pi = M_\pi(r, w, b, g)$.

(l) Use your answer to part (k) to confirm that $W_{D_5}(1, 1, 1, 1) = 136$.

(m) Express $W_{D_5}(r, w, b, g, p)$ as a linear combination of minimal symmetric
polynomials $M_\pi(r, w, b, g, p)$, $\pi \vdash 5$.

7 Consider vertex color patterns of a regular hexagon modulo $G = D_6$, the group
of all 12 of its symmetries. (See Exercise 6(b), Section 3.4.)

(a) Express $W_G(r, w, b)$ as a linear combination of minimal symmetric
polynomials $M_\pi = M_\pi(r, w, b)$, $\pi \vdash 6$.

(b) Exhibit a system of distinct representatives for the color patterns of weight
$r^3w^2b$.

(c) Exhibit a system of distinct representatives for the color patterns of weight
$r^2w^2b^2$.

8 Exhibit six rotationally inequivalent red-white-blue colorings of the faces of a
cube, all having weight $r^2w^2b^2$, and indicate which pairs are equivalent by a
reflection. (Hint: Exercise 7, Section 3.5.)

9 Let G be the group of 24 rotational symmetries of a (regular) octahedron
expressed as permutations of its eight faces. (So, G is comprised of the
permutations p in Fig. 3.4.7.) Express $W_G(r, w, b)$ as a linear combination of
the minimal symmetric polynomials $M_\pi = M_\pi(r, w, b)$, $\pi \vdash 8$.

10 Modulo its group of 24 rotational symmetries, the faces of a regular
octahedron have 333 inequivalent red-white-blue color patterns. (See
Exercise 9.) How many of these have weight

(a) $r^8$? (b) $r^7b$? (c) $r^3w^3b^2$?

(d) $r^4w^2b^2$? (e) $w^4b^4$? (f) $r^4wb^3$?

11 Use Polya's theorem to compute the number of red-white-blue color patterns
of the faces of a cube that have weight $r^2w^2b^2$, modulo the group of all 48 of
its symmetries. (Hint: Be sure your solution is consistent with Exercise 8.)

12 Show that

$$(x_1 + x_2 + \cdots + x_n)^m = \sum_{f \in C_{m,n}} w(f).$$

13 Let G be the group of 12 rotational symmetries of the faces of a regular
tetrahedron. Express $W_G(r, w, b, y)$ as a linear combination of minimal
symmetric polynomials $M_\pi = M_\pi(r, w, b, y)$, $\pi \vdash 4$.

14 Let G be the group of all 24 symmetries of the faces of a regular tetrahedron.

(a) Express $W_G(r, w, b, g)$ as a linear combination of minimal symmetric
polynomials $M_\pi = M_\pi(r, w, b, g)$, $\pi \vdash 4$.

(b) Prove that $W_G(r, w, b, g) = H_4(r, w, b, g)$, the homogeneous symmetric
function of degree 4 from Exercise 25, Section 1.8.

15 Let G be the group of 12 rotational symmetries of the regular tetrahedron
expressed as permutations of its six edges. (See Exercise 12, Section 3.4.) If
two colors are available, x and y, how many rotationally inequivalent
2-colorings of the edges of the tetrahedron have weight

(a) $x^6$? (b) $x^5y$? (c) $x^4y^2$? (d) $x^3y^3$?

16 Let G be the group of all 24 symmetries of a regular tetrahedron expressed as
permutations of its six edges. (See Exercise 15.) Expand $W_G(x, y)$ as a linear
combination of minimal symmetric polynomials $M_\pi = M_\pi(x, y)$, $\pi \vdash 4$.

17 How many rotationally inequivalent ways are there to rack 15 billiard balls in a 
triangular array if there are 5 balls each of three different colors, say, red, 
white, and blue? (Hint: See Exercise 17, Section 3.5.) 

18 Suppose G is a group of symmetries of the "features" (e.g., vertices, faces, or 
edges) of some geometric object. How many red-white-blue color patterns 
(modulo G) of the features use all three colors if 

(a) G is the group of plane rotations of the vertices of a square? 

(b) G is the rotational group of the faces of a cube? 

(c) $G = D_5$, acting on the vertices of a regular pentagon?

(d) $G = D_6$, acting on the vertices of a regular hexagon?

(e) G is the group of 12 rotational symmetries of the edges of a regular 
tetrahedron? 

(f) (get ready for some serious computation) G is the group of 24 rotational 
symmetries of the faces of a regular octahedron. 

19 The chemical formula for benzene is C6H6. It is possible to form new
compounds by substituting various atoms, or groups of atoms, for one or
more of the hydrogens. Benzenediol, e.g., is the generic name for C6H4(OH)2,
obtained by substituting OH groups for two hydrogens. Benzenediol comes in
three variations, pyrocatechol (melting point 105°C), resorcinol (melting point
110°C) and hydroquinone (melting point 171°C). Moreover, dichlorobenzene
(C6H4Cl2), dinitrobenzene (C6H4(NO2)2), and a host of other compounds
obtained by substituting for two of the hydrogens in benzene invariably come
in families of three. From this (and other information), Baron August Kekule
von Stradonitz was able to deduce the structure of benzene.

Consider two early models (neither of which seems to satisfy the valence
condition). In one model, the six carbon atoms are found at the vertices of a
regular hexagon and are bonded to their two nearest neighbors and to one
hydrogen atom. In the other model, the carbon atoms are found at the vertices
of a regular octahedron and are bonded to their four nearest neighbors and to
one hydrogen atom. (Note that both of these models satisfy the chemical
formula C6H6.) Now, replace two of the six hydrogen atoms with bromine.
Color a carbon atom H if it is bonded to hydrogen and B if it is bonded to
bromine.

(a) Modulo $D_6$, how many inequivalent 2-colorings of the vertices of a regular
hexagon have weight $H^4 B^2$?

(b) Modulo its group of rotational symmetries, how many inequivalent
2-colorings of the vertices of a regular octahedron have weight $H^4 B^2$?

(c) Which model, the hexagon or octahedron, is consistent with the 
experimental data? 

(d) Is Polya's theorem the right way to solve this problem? Why or why not? 

20 In how many inequivalent ways, modulo its rotation group, can the faces of a 
truncated icosahedron (see Fig. 3.4.9) be 2-colored if all the hexagons have to 
be the same color and all the pentagons have to be the same color? 

3.7. THE CYCLE INDEX POLYNOMIAL 

The cycle index knows many things. 

— George Polya 

Suppose G is a group of symmetries of the m features* of some geometric object.
Let $\{x_1, x_2, \ldots, x_n\}$ be a set of colors.† Then the pattern inventory
$W_G(x_1, x_2, \ldots, x_n)$ is a polynomial, symmetric in the variables $x_1, x_2, \ldots, x_n$.
Thus, by Theorem 1.9.11, $W_G(x_1, x_2, \ldots, x_n)$ is a polynomial in the power sums

$$M_t = M_{[t]}(x_1, x_2, \ldots, x_n) = x_1^t + x_2^t + \cdots + x_n^t, \quad 1 \le t \le n.$$

In fact, Polya's theorem is neither more nor less than an algorithm for constructing
that mysterious polynomial. A little preparation will help clarify this point.

We are assuming G is a permutation group of degree m (because our geometric
object has m features). Recall that $c_t(p)$ is the number of cycles of length t in the
disjoint cycle factorization of $p \in G$. Because each integer in $\{1, 2, \ldots, m\}$ is
contained in exactly one of these cycles,

$$m = c_1(p) + 2c_2(p) + \cdots + t\,c_t(p) + \cdots + m\,c_m(p). \qquad (3.52)$$

(Equation (3.52) is not the same as $c(p) = c_1(p) + c_2(p) + \cdots + c_m(p)$.)

*Features could be vertices, edges, faces, or even hyperfaces.

†Colors might be anything from red and white to in and out or spin up and spin down.

3.7.1 Definition. Let G be a permutation group of degree m. If $s_1, s_2, \ldots, s_m$ are
independent variables, the cycle index polynomial of G is

$$Z_G(s_1, s_2, \ldots, s_m) = \frac{1}{o(G)} \sum_{p \in G} s_1^{c_1(p)} s_2^{c_2(p)} \cdots s_m^{c_m(p)}. \qquad (3.53)$$



3.7.2 Example. If $G = \{e_4, (12), (34), (12)(34)\}$, then

$$Z_G(s_1, s_2, s_3, s_4) = \tfrac{1}{4}\left(s_1^4 + 2s_1^2 s_2 + s_2^2\right). \qquad (3.54)$$

(Don't forget the cycles of length 1.) For the dihedral group $D_4 = \{e_4, (1243),
(14)(23), (1342), (12)(34), (13)(24), (14), (23)\}$,

$$Z_{D_4}(s_1, s_2, s_3, s_4) = \tfrac{1}{8}\left(s_1^4 + 2s_4 + 3s_2^2 + 2s_1^2 s_2\right). \qquad (3.55)$$

Finally, let $H = \{e_4, (13)(24), (12)(34), (14)(23)\}$. Then H is a subgroup of $D_4$.*
The cycle index polynomial of H is

$$Z_H(s_1, s_2, s_3, s_4) = \tfrac{1}{4}\left(s_1^4 + 3s_2^2\right). \qquad (3.56)$$

Nominally among the variables of all three cycle index polynomials, $s_3$ actually
appears in none of them. Similarly, $s_4$ is missing from the right-hand sides of
Equations (3.54) and (3.56). □
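Definition 3.7.1 can be checked mechanically. The following is a minimal Python sketch, not part of the text: the helper names `from_cycles`, `cycle_type`, and `cycle_index` are illustrative choices, and a monomial is represented (by assumption) as a tuple of pairs (t, c_t), so that ((1, 2), (2, 1)) stands for $s_1^2 s_2$.

```python
from collections import Counter
from fractions import Fraction

def from_cycles(m, *cycles):
    """Build a 0-based image tuple on m points from 1-based disjoint cycles."""
    perm = list(range(m))
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            perm[a - 1] = b - 1
    return tuple(perm)

def cycle_type(perm):
    """Return {cycle length t: c_t(p)} for perm, where perm[i] is the image of i."""
    seen, counts = set(), Counter()
    for start in range(len(perm)):
        if start in seen:
            continue
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            i = perm[i]
            length += 1
        counts[length] += 1
    return counts

def cycle_index(group):
    """Average of the monomials of the group elements, as {monomial: coefficient}."""
    poly = Counter()
    for p in group:
        poly[tuple(sorted(cycle_type(p).items()))] += Fraction(1, len(group))
    return dict(poly)

# The groups G and H of Example 3.7.2
G = [from_cycles(4), from_cycles(4, (1, 2)), from_cycles(4, (3, 4)),
     from_cycles(4, (1, 2), (3, 4))]
H = [from_cycles(4), from_cycles(4, (1, 3), (2, 4)), from_cycles(4, (1, 2), (3, 4)),
     from_cycles(4, (1, 4), (2, 3))]
print(cycle_index(G))
print(cycle_index(H))
```

The two printed dictionaries can be compared term by term with Equations (3.54) and (3.56); the same function applies to any finite list of permutations that forms a group.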

Well, that's it. The cycle index polynomial is the mystery polynomial. In
particular, Polya's theorem can be restated as

$$W_G(x_1, x_2, \ldots, x_n) = \frac{1}{o(G)} \sum_{p \in G} M_1^{c_1(p)} M_2^{c_2(p)} \cdots M_m^{c_m(p)} = Z_G(M_1, M_2, \ldots, M_m). \qquad (3.57)$$

That is to say, the pattern inventory $W_G(x_1, x_2, \ldots, x_n)$ is obtained from the cycle
index polynomial $Z_G(s_1, s_2, \ldots, s_m)$ by substituting $s_k = x_1^k + x_2^k + \cdots + x_n^k$,
$1 \le k \le m$.
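Equation (3.57) can be tested numerically. The sketch below is an illustration under stated assumptions, not the text's method: it assigns arbitrary numbers to the colors, evaluates $Z_G(M_1, \ldots, M_m)$ for the group $D_4$ of Example 3.7.2, and compares the result with a brute-force sum of the weights of one coloring per orbit. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import prod

# D_4 from Example 3.7.2 as 0-based image tuples: p[i] is the image of vertex i.
D4 = [
    (0, 1, 2, 3),  # e_4
    (1, 3, 0, 2),  # (1243)
    (3, 2, 1, 0),  # (14)(23)
    (2, 0, 3, 1),  # (1342)
    (1, 0, 3, 2),  # (12)(34)
    (2, 3, 0, 1),  # (13)(24)
    (3, 1, 2, 0),  # (14)
    (0, 2, 1, 3),  # (23)
]

def cycle_lengths(perm):
    """Lengths of the cycles in the disjoint cycle factorization of perm."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            i = perm[i]
            length += 1
        if length:
            lengths.append(length)
    return lengths

def inventory_from_cycle_index(group, x):
    """Evaluate Z_G(M_1, ..., M_m), where M_k is the kth power sum of the numbers in x."""
    total = sum(prod(sum(xi**k for xi in x) for k in cycle_lengths(p)) for p in group)
    return Fraction(total, len(group))

def inventory_by_orbits(group, x):
    """Add up the weight of one representative coloring from each orbit of the group."""
    m, seen, total = len(group[0]), set(), 0
    for f in product(range(len(x)), repeat=m):
        if f in seen:
            continue
        seen |= {tuple(f[p[i]] for i in range(m)) for p in group}
        total += prod(x[color] for color in f)
    return total

x = (2, 3, 5)  # arbitrary numerical values standing in for three colors
print(inventory_from_cycle_index(D4, x) == inventory_by_orbits(D4, x))
print(inventory_from_cycle_index(D4, (1, 1)))  # number of 2-color patterns
```

Setting every color equal to 1 turns the pattern inventory into a plain count of color patterns, which is why the last line reports how many inequivalent 2-colorings the square has.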

*As abstract groups, G and H are isomorphic, yet their cycle index polynomials are different.

Computing a cycle index polynomial can be a bit complicated. Here are two
hints to help avoid common mistakes:

1. $Z_G(s_1, s_2, \ldots, s_m)$ is an average of monomials, one for each permutation in G.
So, the sum of its coefficients is 1.

2. In each monomial term $\prod_t s_t^{c_t(p)}$, the sum of the products $t\,c_t(p)$ is the degree
of G. (See Equation (3.52).)

Confirm that these rules hold in Example 3.7.2 and for the cycle index polynomial

$$Z_{S_4}(s_1, s_2, s_3, s_4) = \tfrac{1}{24}\left(s_1^4 + 3s_2^2 + 6s_4 + 6s_1^2 s_2 + 8s_1 s_3\right). \qquad (3.58)$$

Obviously important because of its association with Polya's theorem, the cycle
index polynomial emerges in other contexts as well. Notice, e.g., that $s_1$ occurs in
$\prod_t s_t^{c_t(p)}$ if and only if $c_1(p) > 0$, if and only if p has a fixed point. It follows that the
mth derangement number D(m) can be read directly from $Z_{S_m}$ (or, more accurately,
from $m!\,Z_{S_m}$). Apart from $m!$ in the denominator, D(m) is the sum of the coefficients
of the terms that do not contain $s_1$. From Equation (3.58), e.g.,

D(4) = 3 + 6 = 9. 
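The same reading of D(m) from $m!\,Z_{S_m}$ can be sketched in Python. This is an illustrative brute-force enumeration of $S_m$ (practical only for small m); the function names are hypothetical, and a monomial is recorded simply as the sorted tuple of cycle lengths of a permutation.

```python
from collections import Counter
from itertools import permutations

def cycle_type(perm):
    """Sorted tuple of cycle lengths of perm (perm[i] is the image of i)."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            i = perm[i]
            length += 1
        if length:
            lengths.append(length)
    return tuple(sorted(lengths))

def scaled_cycle_index(m):
    """m! * Z_m as {cycle type: number of permutations in S_m of that type}."""
    return Counter(cycle_type(p) for p in permutations(range(m)))

def derangement_number(m):
    """Sum the coefficients of the monomials of m! * Z_m that contain no s_1."""
    return sum(n for t, n in scaled_cycle_index(m).items() if 1 not in t)

print(scaled_cycle_index(4))
print(derangement_number(4))  # 9, matching D(4) = 3 + 6
```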

3.7.3 Definition. To simplify the notation, denote the cycle index polynomial
for $S_m$ by $Z_m$, i.e.,

$$Z_m = Z_m(s_1, s_2, \ldots, s_m) = Z_{S_m}(s_1, s_2, \ldots, s_m).$$

We will return momentarily to the substitution $s_k = x_1^k + x_2^k + \cdots + x_n^k$,
$1 \le k \le m$. Meanwhile, the next result involves a different substitution.

3.7.4 Theorem. Setting $s_k = x$ in $m!\,Z_m(s_1, s_2, \ldots, s_m)$, $1 \le k \le m$, yields

$$m!\,Z_m(x, x, \ldots, x) = \sum_{r=1}^{m} s(m, r)\,x^r,$$

where s(m, r) is a Stirling number of the first kind, $1 \le r \le m$.

Proof.

$$m!\,Z_m(x, x, \ldots, x) = \sum_{p \in S_m} x^{c(p)},$$

where $c(p) = c_1(p) + c_2(p) + \cdots + c_m(p)$ is the number of cycles in the disjoint
cycle factorization of p. The coefficient of $x^r$ on the right-hand side of the equation
is $o(\{p \in S_m : c(p) = r\}) = s(m, r)$, the number of permutations whose disjoint
cycle factorizations consist of (exactly) r cycles. ■

3.7.5 Example. From Equation (3.58),

$$24\,Z_4(s_1, s_2, s_3, s_4) = s_1^4 + 3s_2^2 + 6s_4 + 6s_1^2 s_2 + 8s_1 s_3.$$

So,

$$24\,Z_4(x, x, x, x) = x^4 + 3x^2 + 6x + 6x^3 + 8x^2 = x^4 + 6x^3 + 11x^2 + 6x.$$

By Theorem 3.7.4, the coefficients (in reverse order) are s(4, 1) = 6, s(4, 2) = 11,
s(4, 3) = 6, and s(4, 4) = 1, consistent with the fourth row of Fig. 2.5.2.

Wait a minute. It is customary when writing a polynomial in the single variable x
to begin with the highest power of x. It is clear from Example 3.7.5, however,
that reversing the terms of $24\,Z_4(x, x, x, x)$ gives $6x + 11x^2 + 6x^3 + x^4 =
s(4, 1)x + s(4, 2)x^2 + s(4, 3)x^3 + s(4, 4)x^4 = g_4(x)$, the generating function from
Equation (2.29).

3.7.6 Corollary. Setting $s_k = x$ in $m!\,Z_m(s_1, s_2, \ldots, s_m)$, $1 \le k \le m$, yields

$$\sum_{p \in S_m} x^{c(p)} = x(x + 1)(x + 2) \cdots (x + m - 1).$$

Proof. Theorems 2.5.4 and 3.7.4. ■

This might be a good time to reconfirm that

$$x(x + 1)(x + 2)(x + 3) = x^4 + 6x^3 + 11x^2 + 6x.$$
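Corollary 3.7.6 lends itself to a direct check. The Python sketch below is an illustration rather than anything from the text: it counts the permutations of $S_m$ by number of cycles and, separately, expands $x(x+1)\cdots(x+m-1)$ with integer coefficient lists. The function names are hypothetical.

```python
from collections import Counter
from itertools import permutations

def cycle_count(perm):
    """Number of cycles, c(p), in the disjoint cycle factorization of perm."""
    seen, cycles = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            cycles += 1
            i = start
            while i not in seen:
                seen.add(i)
                i = perm[i]
    return cycles

def stirling_first_row(m):
    """[s(m, 0), s(m, 1), ..., s(m, m)] by counting permutations with r cycles."""
    counts = Counter(cycle_count(p) for p in permutations(range(m)))
    return [counts[r] for r in range(m + 1)]

def rising_factorial_coefficients(m):
    """Coefficients of x(x + 1)...(x + m - 1), lowest power of x first."""
    coeffs = [1]
    for k in range(m):
        shifted = [0] + coeffs                  # multiply by x
        scaled = [k * c for c in coeffs] + [0]  # multiply by k
        coeffs = [a + b for a, b in zip(shifted, scaled)]
    return coeffs

print(stirling_first_row(4))             # [0, 6, 11, 6, 1]
print(rising_factorial_coefficients(4))  # [0, 6, 11, 6, 1]
```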

Recall (Theorem 1.7.5) that there are C(m + n - 1, m) different monomials of
degree m in n variables. Thus, a fixed but arbitrary coloring $f \in C_{m,n}$ might have any
one of C(m + n - 1, m) different weights $w = w(f)$. Our interest in the pattern
inventory stems from the fact that inequivalent colorings can have the same weight.
The coefficient of w in $W_G(x_1, x_2, \ldots, x_n)$ is the number of color patterns of
weight w.

Suppose $f, g \in C_{m,n}$. Then $w(f) = w(g)$ if and only if the sequences
$f = (f(1), f(2), \ldots, f(m))$ and $g = (g(1), g(2), \ldots, g(m))$ contain the same colors,
with the same multiplicities, if and only if the sequence g is some rearrangement of
the sequence f, if and only if $g = fp$ for some permutation $p \in S_m$, if and only if
$\hat{p}(g) = gp^{-1} = f$ for some $p \in S_m$. In other words, two color patterns modulo $S_m$
are equal if and only if they have the same weight. It follows that the pattern
inventory for $S_m$ is a sum of all C(m + n - 1, m) monomials of degree m in $x_1, x_2, \ldots, x_n$,
each occurring with multiplicity 1.

Let's work out the pattern inventory for $S_4$ when n = 3. Setting $x_1 = x$, $x_2 = y$,
and $x_3 = z$ in Equations (3.57) and (3.58) yields (check it)

$$W_{S_4}(x, y, z) = Z_4(x + y + z,\; x^2 + y^2 + z^2,\; x^3 + y^3 + z^3,\; x^4 + y^4 + z^4)$$
$$= \tfrac{1}{24}\left[(x + y + z)^4 + 3(x^2 + y^2 + z^2)^2 + 6(x^4 + y^4 + z^4) + 6(x + y + z)^2(x^2 + y^2 + z^2) + 8(x + y + z)(x^3 + y^3 + z^3)\right]$$
$$= [x^4 + y^4 + z^4] + [x^3 y + x^3 z + xy^3 + xz^3 + y^3 z + yz^3] + [x^2 y^2 + x^2 z^2 + y^2 z^2] + [x^2 yz + xy^2 z + xyz^2]$$
$$= M_{[4]}(x, y, z) + M_{[3,1]}(x, y, z) + M_{[2^2]}(x, y, z) + M_{[2,1^2]}(x, y, z). \qquad (3.59)$$

As predicted, each of the C(4 + 3 - 1, 4) = 15 monomials of degree 4 occurs
exactly once in Equation (3.59).

3.7.7 Definition. The mth homogeneous symmetric function $H_m(x_1, x_2, \ldots, x_n)$
is the sum of all C(m + n - 1, m) monomials of (total) degree m in the variables
$x_1, x_2, \ldots, x_n$, i.e.,

$$H_m(x_1, x_2, \ldots, x_n) = \sum_{\pi} M_\pi(x_1, x_2, \ldots, x_n),$$

where the summation is over the partitions $\pi$ of m having at most n parts.

Definition 3.7.7 is extended by defining $H_0(x_1, x_2, \ldots, x_n) = 1$.

3.7.8 Theorem. The mth homogeneous symmetric function

(a) $H_m(x_1, x_2, \ldots, x_n) = Z_m(M_1, M_2, \ldots, M_m)$,

where $M_t = M_{[t]}(x_1, x_2, \ldots, x_n) = x_1^t + x_2^t + \cdots + x_n^t$, $1 \le t \le m$, and

(b) $H_m(x_1, x_2, \ldots, x_n) = \sum_{f \in G_{m,n}} \prod_{i=1}^{m} x_{f(i)}$,

where $G_{m,n} \subset F_{m,n}$ is the set of all C(m + n - 1, m) nondecreasing functions from
$\{1, 2, \ldots, m\}$ into $\{1, 2, \ldots, n\}$.

Proof. Part (a) summarizes the previous discussion. Since the terms in a product 
of commuting variables can be rearranged into nondecreasing order, part (b) is just 
a restatement of the definition. ■ 
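Both parts of Theorem 3.7.8 can be compared numerically. In the Python sketch below (an illustration with hypothetical function names), part (b) is evaluated with `itertools.combinations_with_replacement`, which produces exactly one selection per nondecreasing function, and part (a) by brute-force enumeration of $S_m$, so it is only practical for small m.

```python
from fractions import Fraction
from itertools import combinations_with_replacement, permutations
from math import factorial, prod

def h_by_definition(m, x):
    """H_m at the numbers x: one monomial per nondecreasing choice of m indices."""
    return sum(prod(c) for c in combinations_with_replacement(x, m))

def cycle_lengths(perm):
    """Lengths of the cycles in the disjoint cycle factorization of perm."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            i = perm[i]
            length += 1
        if length:
            lengths.append(length)
    return lengths

def h_by_cycle_index(m, x):
    """Z_m(M_1, ..., M_m), with M_t the t-th power sum of the numbers in x."""
    total = sum(
        prod(sum(xi**t for xi in x) for t in cycle_lengths(p))
        for p in permutations(range(m))
    )
    return Fraction(total, factorial(m))

print(h_by_definition(4, (1, 1, 1)))  # 15 = C(4 + 3 - 1, 4) monomials
print(h_by_cycle_index(4, (2, 3, 5)) == h_by_definition(4, (2, 3, 5)))
```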

The homogeneous symmetric functions have many properties reminiscent of the
more glamorous elementary symmetric functions. For one thing, Theorem 3.7.8(b)
is the natural analog of Theorem 2.1.9:

$$E_m(x_1, x_2, \ldots, x_n) = \sum_{f \in Q_{m,n}} \prod_{i=1}^{m} x_{f(i)},$$

where $Q_{m,n}$ is the subset of $F_{m,n}$ consisting of the C(n, m) (strictly) increasing
functions. Another similarity involves Stirling numbers. Recall (Corollary 2.5.5) that the
Stirling number of the first kind, $s(m, n) = E_{m-n}(1, 2, \ldots, m - 1)$.

3.7.9 Theorem. The Stirling number of the second kind, $S(m, n) =
H_{m-n}(1, 2, \ldots, n)$, $1 \le n \le m$.

Proof. Define $h(m, n) = H_{m-n}(1, 2, \ldots, n)$, $1 \le n \le m$. Because the arrays
h(m, n) and S(m, n) satisfy the same initial conditions and the same recurrence
(see Exercise 11, Section 2.1), they must be identical. ■
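Theorem 3.7.9 can be spot-checked in a few lines of Python. The sketch assumes the standard recurrence $S(m, n) = S(m-1, n-1) + n\,S(m-1, n)$ for Stirling numbers of the second kind (the text cites a recurrence without displaying it here); the function names are illustrative.

```python
from itertools import combinations_with_replacement
from math import prod

def homogeneous(m, x):
    """H_m evaluated at the numbers x (sum over all monomials of degree m)."""
    return sum(prod(c) for c in combinations_with_replacement(x, m))

def stirling_second(m, n):
    """S(m, n) from the assumed recurrence S(m, n) = S(m-1, n-1) + n*S(m-1, n)."""
    if m == n:
        return 1
    if n == 0 or n > m:
        return 0
    return stirling_second(m - 1, n - 1) + n * stirling_second(m - 1, n)

# S(m, n) = H_{m-n}(1, 2, ..., n)
print(homogeneous(3, range(1, 3)), stirling_second(5, 2))  # 15 15
print(homogeneous(3, range(1, 4)), stirling_second(6, 3))  # 90 90
```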

The next identity relates binomial coefficients to cycle structure.

3.7.10 Corollary. If m and n are any two positive integers, then

$$\frac{1}{m!} \sum_{p \in S_m} n^{c(p)} = C(m + n - 1, m).$$

Proof. Set $x_1 = x_2 = \cdots = x_n = 1$ in Theorem 3.7.8 or set $x = n$ in
Corollary 3.7.6. ■

Polya's theorem can be found in a 1937 paper entitled Kombinatorische
Anzahlbestimmungen für Gruppen, Graphen und chemische Verbindungen (Combinatorial
Enumeration for Groups, Graphs, and Chemical Compounds).* The part about
graphs will be discussed in Chapter 5. That application requires the cycle
index polynomial of the so-called pair group, bringing us to the final topic of
this chapter.

3.7.11 Definition. Let $V = \{1, 2, \ldots, m\}$, and define $V^{(2)}$ to be the family of all
C(m, 2) two-element subsets of V. For each $p \in S_m$, let $\tilde{p}$ be the natural action of p
on $V^{(2)}$ defined by $\tilde{p}(\{i, j\}) = \{p(i), p(j)\}$.† The pair group $S_m^{(2)} = \{\tilde{p} : p \in S_m\}$.

Because $\widetilde{pq} = \tilde{p}\tilde{q}$ for all $p, q \in S_m$ (see Exercise 6), $S_m^{(2)}$ is closed, i.e., it is a
permutation group acting on $V^{(2)}$. Because $o(V^{(2)}) = C(m, 2)$, we may view $S_m^{(2)}$
as a subgroup of $S_{C(m,2)}$. (Since $o(S_m^{(2)}) = m!$ is much less than $[\tfrac{1}{2}m(m - 1)]! =
o(S_{C(m,2)})$, $S_m^{(2)}$ is a relatively small subgroup of $S_{C(m,2)}$.)

*See G. Polya and R. C. Read, Combinatorial Enumeration of Groups, Graphs, and Chemical Compounds,
Springer-Verlag, New York, 1987.

†Similar induced actions can be found in Section 3.4.
[Figure 3.7.1: a table listing each permutation $p \in S_4$ beside its induced permutation $\tilde{p}$.]

Figure 3.7.1. The pair group $S_4^{(2)} = \{\tilde{p} : p \in S_4\}$.



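3.7.12 Example. If m = 4, then $V^{(2)} = \{\{1, 2\}, \{1, 3\}, \{1, 4\}, \{2, 3\}, \{2, 4\},
\{3, 4\}\}$. Numbering the elements of $V^{(2)}$ in dictionary order, using boldface
numerals, we have

$$\mathbf{1} = \{1, 2\}, \quad \mathbf{2} = \{1, 3\}, \quad \mathbf{3} = \{1, 4\},$$
$$\mathbf{4} = \{2, 3\}, \quad \mathbf{5} = \{2, 4\}, \quad \mathbf{6} = \{3, 4\}.$$

Suppose $p = (123) \in S_4$. Let's compute the disjoint cycle factorization of $\tilde{p} \in S_4^{(2)}$:

$$\tilde{p}(\mathbf{1}) = \tilde{p}(\{1, 2\}) = \{p(1), p(2)\} = \{2, 3\} = \mathbf{4},$$
$$\tilde{p}(\mathbf{4}) = \tilde{p}(\{2, 3\}) = \{p(2), p(3)\} = \{3, 1\} = \mathbf{2},$$
$$\tilde{p}(\mathbf{2}) = \tilde{p}(\{1, 3\}) = \{p(1), p(3)\} = \{2, 1\} = \mathbf{1}.$$

So, $(\mathbf{142})$ is a cycle of $\tilde{p}$. Continuing,

$$\tilde{p}(\mathbf{3}) = \tilde{p}(\{1, 4\}) = \{p(1), p(4)\} = \{2, 4\} = \mathbf{5},$$
$$\tilde{p}(\mathbf{5}) = \tilde{p}(\{2, 4\}) = \{p(2), p(4)\} = \{3, 4\} = \mathbf{6},$$
$$\tilde{p}(\mathbf{6}) = \tilde{p}(\{3, 4\}) = \{p(3), p(4)\} = \{1, 4\} = \mathbf{3}.$$

Therefore, $\tilde{p} = (\mathbf{142})(\mathbf{356}) \in S_4^{(2)}$. All 4! = 24 elements of the pair group $S_4^{(2)}$ can
be found in Fig. 3.7.1. □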
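The computation in Example 3.7.12 can be automated. The Python sketch below is illustrative only: `induced_on_pairs` and `cycles` are hypothetical helper names, a permutation is given as a dict from each point to its image, and the pairs are labeled 1 through C(m, 2) in dictionary order as in the example.

```python
from itertools import combinations

def induced_on_pairs(p, m):
    """Induced permutation on the 2-element subsets of {1..m}, as a dict on labels.

    p maps each of 1..m to its image; subsets are numbered in dictionary order.
    """
    pairs = list(combinations(range(1, m + 1), 2))
    label = {pair: k for k, pair in enumerate(pairs, start=1)}
    return {label[(i, j)]: label[tuple(sorted((p[i], p[j])))] for (i, j) in pairs}

def cycles(perm):
    """Disjoint cycles of length greater than 1 of a permutation given as a dict."""
    seen, result = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cycle, k = [], start
        while k not in seen:
            seen.add(k)
            cycle.append(k)
            k = perm[k]
        if len(cycle) > 1:
            result.append(tuple(cycle))
    return result

p = {1: 2, 2: 3, 3: 1, 4: 4}           # p = (123) in S_4
print(cycles(induced_on_pairs(p, 4)))  # [(1, 4, 2), (3, 5, 6)]
```

Applying the same two functions to every element of $S_4$ would regenerate the table summarized in Fig. 3.7.1.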

Using Fig. 3.7.1, it is easy to produce the cycle index polynomial

$$Z_{S_4^{(2)}}(s_1, s_2, \ldots, s_6) = \tfrac{1}{24}\left(s_1^6 + 9s_1^2 s_2^2 + 8s_3^2 + 6s_2 s_4\right). \qquad (3.60)$$

On the other hand, if all we want is its cycle index polynomial, it is not necessary to
compute the disjoint cycle factorization of every element of $S_4^{(2)}$.

3.7.13 Lemma. Let $\tilde{p}$ and $\tilde{q}$ be the elements of $S_m^{(2)}$ induced by the permutations
p and q of $S_m$, respectively. If p and q have the same cycle structure, then $\tilde{p}$ and $\tilde{q}$ have the
same cycle structure.

Proof. Let $p \in S_m$. Fix $i, j \in \{1, 2, \ldots, m\}$. Let $p_1$ be the permutation obtained by
interchanging the positions of i and j in the disjoint cycle factorization of p. Then $p_1$
has the same cycle structure as p. Moreover, $\tilde{p}_1$ can be obtained by interchanging
the positions of $r_k = \{i, k\}$ and $t_k = \{j, k\}$ in the disjoint cycle factorization of $\tilde{p}$ for
each k different from i and j. In particular, $\tilde{p}_1$ and $\tilde{p}$ have the same cycle structure.
Because p and q have the same cycle structure if and only if q can be obtained from
p by a sequence of such interchanges, the proof is complete. ■

3.7.14 Example. The converse of Lemma 3.7.13 is false. If p = (12) and q =
(13)(24), then (Fig. 3.7.1) both $\tilde{p}$ and $\tilde{q}$ have cycle type $[2^2, 1^2]$. □

It follows from Lemma 3.7.13 that the cycle index polynomial for $S_m^{(2)}$ is just a
modification of $Z_m$.

3.7.15 Example. Recall from Equation (3.58) that

$$Z_4(s_1, s_2, s_3, s_4) = \tfrac{1}{24}\left(s_1^4 + 3s_2^2 + 6s_4 + 6s_1^2 s_2 + 8s_1 s_3\right). \qquad (3.61)$$

To see how $Z_4$ can be modified to obtain the cycle index polynomial for the pair
group $S_4^{(2)}$, observe that since $\tilde{e}_4 = e_6$, the monomial $s_1^4$ in $Z_4$ should be replaced
with $s_1^6$. Because (see Fig. 3.7.1) the induced action on $V^{(2)}$ of p = (12)(34) is
$\tilde{p} = (\mathbf{25})(\mathbf{34})$, the term $3s_2^2$ in $Z_4$, corresponding to the three permutations in $S_4$
of cycle type $[2^2]$, is replaced with $3s_1^2 s_2^2$. For the same reason, $6s_4$ is replaced
with $6s_2 s_4$ and $6s_1^2 s_2$ with $6s_1^2 s_2^2$. When $8s_3^2$ is substituted for $8s_1 s_3$ and like terms
are combined, the transformation of Equation (3.61) into Equation (3.60) is
complete. □

3.7.16 Example. The cycle index polynomial for $S_5^{(2)}$ is

$$\tfrac{1}{120}\left[s_1^{10} + 10s_1^4 s_2^3 + 20s_1 s_3^3 + 15s_1^2 s_2^4 + 30s_2 s_4^2 + 20s_1 s_3 s_6 + 24s_5^2\right]. \quad □$$

3.7.17 Example. The cycle index polynomial for $S_6^{(2)}$ is

$$\tfrac{1}{720}\left[s_1^{15} + 15s_1^7 s_2^4 + 40s_1^3 s_3^4 + 60s_1^3 s_2^6 + 180s_1 s_2 s_4^3 + 144s_5^3 + 120s_1 s_2 s_3^2 s_6 + 40s_3^5 + 120s_3 s_6^2\right]. \quad □$$
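The pair-group cycle index polynomials above can be recomputed by brute force. The following Python sketch is an illustration, not the shortcut described in Example 3.7.15: it enumerates all of $S_m$, records the cycle type of each induced permutation on pairs as a sorted tuple of cycle lengths, and tallies the result, which is $m!$ times the cycle index. The function names are hypothetical.

```python
from collections import Counter
from itertools import combinations, permutations

def pair_cycle_type(p):
    """Cycle type of the permutation induced by p on 2-element subsets.

    p is a tuple with p[i] the image of i, for i in 0..m-1.
    """
    pairs = list(combinations(range(len(p)), 2))
    image = {pair: tuple(sorted((p[pair[0]], p[pair[1]]))) for pair in pairs}
    seen, lengths = set(), []
    for start in pairs:
        length, pair = 0, start
        while pair not in seen:
            seen.add(pair)
            pair = image[pair]
            length += 1
        if length:
            lengths.append(length)
    return tuple(sorted(lengths))

def scaled_pair_cycle_index(m):
    """m! times the cycle index of the pair group, as {cycle type: count}."""
    return Counter(pair_cycle_type(p) for p in permutations(range(m)))

for cycle_type, count in sorted(scaled_pair_cycle_index(4).items()):
    print(count, cycle_type)
```

For m = 4 the tally corresponds to the four monomials of Equation (3.60); for example, the type (1, 1, 2, 2) stands for $s_1^2 s_2^2$.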



3.7. EXERCISES 

1 Compute $Z_3$, the cycle index polynomial for $S_3$.

2 Use the result of Exercise 1

(a) to compute the derangement number D(3).

(b) to confirm Theorem 3.7.4 when m = 3.

(c) and Theorem 3.7.8(a) to compute $H_3(x, y)$.

(d) Use Theorem 3.7.8(b) to compute $H_3(x, y)$.

(e) Use Theorem 3.7.9 and your answer to parts (c) and (d) to compute S(5, 2).

(f) Modify the approach of part (e) to compute S(6, 3).

3 Confirm Corollary 3.7.10 when m = 3.

4 Compute the cycle index polynomial for the cyclic group

(a) $G = \langle (12345) \rangle$. (b) $G = \langle (123456) \rangle$.

(c) $G = \langle (1234567) \rangle$. (d) $G = \langle (12345678) \rangle$.

5 Let G be the rotational symmetry group of a cube.

(a) If G is expressed as permutations of the faces of the cube, show that

$$Z_G(s_1, s_2, \ldots, s_6) = \tfrac{1}{24}\left[s_1^6 + 3s_1^2 s_2^2 + 6s_1^2 s_4 + 6s_2^3 + 8s_3^2\right].$$

(Note that $s_5$ and $s_6$ are missing from the right-hand side of this expression.)

(b) If G is expressed as permutations of the vertices of the cube, show that

$$Z_G(s_1, s_2, \ldots, s_8) = \tfrac{1}{24}\left[s_1^8 + 9s_2^4 + 6s_4^2 + 8s_1^2 s_3^2\right].$$

6 Let $\tilde{p}$ and $\tilde{q}$ be the elements of $S_m^{(2)}$ induced by the permutations p and q of $S_m$,
respectively. Prove that

(a) $\widetilde{pq} = \tilde{p}\tilde{q}$. (b) $\widetilde{p^{-1}} = \tilde{p}^{-1}$.

7 Use Fig. 3.7.1 to confirm Exercise 6(b) when

(a) p = (123). (b) p = (1234).

(c) p = (1324). (d) p = (1423).

8 Compute $Z_5(s_1, s_2, \ldots, s_5)$, the cycle index polynomial for $S_5$.

9 Use the result of Exercise 8

(a) to compute the derangement number D(5).

(b) to confirm Theorem 3.7.4 when m = 5.

(c) and Theorem 3.7.8(a) to compute $H_5(x, y)$.

(d) Use Theorem 3.7.8(b) to compute $H_5(x, y)$.

(e) Use Theorem 3.7.9 and your answer to parts (c) and (d) to compute
S(7, 2).

10 Suppose the group of symmetries of the m faces of some object is $G = S_m$.

(a) Show that two colorings of the faces of the object are equivalent if and
only if they have the same weight.

(b) Give a combinatorial argument, independent of Corollary 3.7.10, to show
that the number of inequivalent n-colorings of the m faces of the object is
C(m + n - 1, m).

11 Show that the partial derivative of $Z_5$ with respect to $s_1$ is $Z_4$.

12 Show that the partial derivative of $Z_m$ with respect to $s_m$ is (m - 1)!.

13 Confirm Example 3.7.16.

14 Define $Z_0 = 1$ (and recall that $Z_m = Z_m(s_1, s_2, \ldots, s_m)$). It can be shown that

$$mZ_m = \sum_{k=1}^{m} s_k Z_{m-k}.$$

(a) Confirm this result when m = 4.

(b) Use this result to compute $Z_6$. (Hint: Exercise 8.)

(c) Confirm your answer to part (b) by computing $Z_6$ directly from the
definition of cycle index polynomial.

15 Use the result of Exercise 14(b) or (c) to

(a) evaluate the derangement number D(6).

(b) confirm the m = 6 case of Theorem 3.7.4.

16 Show that the pair group $S_4^{(2)}$ is the group of (all 24) symmetries of a regular
tetrahedron expressed as permutations of its six edges. (See Exercise 16,
Section 3.6.)

17 Let G be the rotational symmetry group of a regular dodecahedron, expressed
as permutations of its 12 faces. (See Exercise 17, Section 3.4.) Show that

$$Z_G(s_1, s_2, \ldots, s_{12}) = \tfrac{1}{60}\left(s_1^{12} + 15s_2^6 + 20s_3^4 + 24s_1^2 s_5^2\right).$$

18 Prove that

$$\frac{1}{m!} \sum_{r=1}^{m} s(m, r)\,n^r = C(m + n - 1, m).$$

19 Prove that

$$k^m = \sum_{r=1}^{m} (-1)^{m+r}\, r!\, S(m, r)\, C(k + r - 1, r).$$

20 Prove that

$$Z_m(s_1, s_2, \ldots, s_m) = \sum \frac{s_1^{k_1} s_2^{k_2} \cdots s_m^{k_m}}{1^{k_1} k_1!\; 2^{k_2} k_2! \cdots m^{k_m} k_m!},$$

where the sum is over all nonnegative integer sequences $k_1, k_2, \ldots, k_m$
satisfying $k_1 + 2k_2 + 3k_3 + \cdots + mk_m = m$. (Hint: Exercise 19, Section 2.4.)

21 The analogue of Theorem 3.7.8(a) for elementary symmetric functions,
namely,

$$E_m(x_1, x_2, \ldots, x_n) = (-1)^m Z_m(-M_1, -M_2, \ldots, -M_m),$$

can be proved using Newton's identities. Show that

(a) Equation (1.40) in Section 1.9 is the m = 2 case of this equation.

(b) Equation (1.41) is the m = 3 case of this equation.

(c) Equation (1.42) is the m = 4 case of this equation.

22 If

$$U_m = \begin{pmatrix}
s_1 & 1 & 0 & 0 & \cdots & 0 & 0 \\
s_2 & s_1 & 2 & 0 & \cdots & 0 & 0 \\
s_3 & s_2 & s_1 & 3 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & & \vdots & \vdots \\
s_{m-1} & s_{m-2} & s_{m-3} & s_{m-4} & \cdots & s_1 & m - 1 \\
s_m & s_{m-1} & s_{m-2} & s_{m-3} & \cdots & s_2 & s_1
\end{pmatrix},$$

then $\mathrm{per}(U_m) = m!\,Z_m(s_1, s_2, \ldots, s_m)$, where "per" is the permanent function
defined in Exercise 19, Section 3.5. Confirm this formula when

(a) m = 2. (b) m = 3. (c) m = 4.

23 Let $L_m$ be the matrix obtained from $U_m$ (Exercise 22) by replacing $s_t$ with
$M_t = M_{[t]}(x_1, x_2, \ldots, x_n)$, $1 \le t \le m$. Then (Exercise 20, Section 1.9),
$\det(L_m) = m!\,E_m(x_1, x_2, \ldots, x_n)$. It follows from Theorem 3.7.8(a) and
Exercise 22 that $\mathrm{per}(L_m) = m!\,H_m(x_1, x_2, \ldots, x_n)$. Confirm this formula for
$H_m$ when

(a) m = 2. (b) m = 3. (c) m = 4.

24 Confirm the computations leading to Equation (3.59).

25 Use Theorem 3.7.9 to compute

(a) S(6, 4). (b) S(5, 3).

26 Prove Cauchy's identity: $\sum \left(1^{k_1} k_1!\; 2^{k_2} k_2! \cdots m^{k_m} k_m!\right)^{-1} = 1$, where the sum is
over all nonnegative integer sequences $k_1, k_2, \ldots, k_m$ satisfying $k_1 + 2k_2 +
3k_3 + \cdots + mk_m = m$. (Hint: Exercise 20.)
On a superficial level, a generating function is simply a way to exhibit a sequence of
numbers $a_0, a_1, a_2, \ldots$. However, the act of writing

$$g(x) = a_0 + a_1 x + a_2 x^2 + \cdots$$

has some surprising consequences. Because the left-hand side of this expression 
looks like a function, it is tempting to treat the right-hand side as if it were one, 
a "mistake" having some interesting implications. 

Those sequences $a_0, a_1, a_2, \ldots$ with the property that $a_n$ is a polynomial function
of n are characterized in the first section. Ordinary generating functions and some
of their properties are discussed in Section 4.2. Applications, e.g., to Newton's 
binomial theorem, are the focus of Section 4.3. Section 4.4 deals with some varia- 
tions on the generating function idea. Techniques for solving recurrences occupy 
the final section. 

Apart from the observation in Section 4.2 that the pattern inventory is a gener- 
ating function, one that doesn't generate anything but is generated by the cycle 
index polynomial, Chapter 4 is independent of Chapter 3. Readers may go directly 
from Chapter 2 to Chapter 4. Natural places to exit from Chapter 4 are the ends of 
Sections 4.1 or 4.3, just before Definition 4.4.9 in Section 4.4, or at the end of 
Section 4.4. 



4.1. DIFFERENCE SEQUENCES 

A standard feature of American education in the mid-twentieth century was the so- 
called IQ test. Typical of these tests were pattern recognition problems like this: 

6, 13, 20, 27, _, (4.1) 

it being understood that one should find the next number in the sequence after 
27. Because the next Sunday after June 27, 2004, is the fourth of July, it may be
that the answer is 4. Doubtless the author of the test had another answer in mind,
probably 34.

4.1.1 Definition. The notation $\{a_n\}$ is used to denote the sequence $a_0, a_1, a_2, \ldots$.

Note that the first number in the sequence $\{a_n\}$ is the zeroth term, $a_0$. The 4th
number in Sequence (4.1) is $27 = a_3$. (While this system may seem awkward now,
it will simplify our work later on.)

4.1.2 Definition. The sequence $\{a_n\}$ is arithmetic if, for all $n \ge 0$, the difference
$a_{n+1} - a_n = d$ is a constant, independent of n.

An arithmetic sequence satisfies the pattern, or recurrence, $a_{n+1} = a_n + d$,
$n \ge 0$. Given that Sequence (4.1) is an arithmetic sequence, d = 7,
and there can be no ambiguity about the 5th number. It is $a_4 = 27 + 7 = 34$. So
far, so good. Now you know how to exhibit intelligence by the standards of the
last century.

What if you were asked to determine, not $a_4$, but $a_{400}$? Using the recurrence
$a_{400} = a_{399} + 7$ is not much help. The key to solving Sequence (4.1) is to think
of it symbolically, as

6, 6 + 7, (6 + 7) + 7, (6 + 7 + 7) + 7, ...

From this perspective, it is clear that $a_n$ is a sum of n + 1 numbers, one 6 and n 7's,
i.e., $a_n = 7n + 6$. So, $a_{400} = 7 \times 400 + 6 = 2806$. This solution illustrates the
tension between mathematics and computation. Doing the arithmetic at each step leads
to $a_{400} = a_{399} + 7$. Not doing the arithmetic reveals a pattern leading to the
mathematical abstraction $a_n = 7n + 6$.

More generally, every arithmetic sequence takes the form

$$a_0,\; a_0 + d,\; a_0 + 2d,\; a_0 + 3d,\; \ldots$$

So, the nth term of an arithmetic sequence (the (n + 1)st number in the sequence) is

$$a_n = dn + a_0. \qquad (4.2)$$

An expression like Equation (4.2), in which $a_n$ is given as an explicit function of n,
is called a closed formula, or solution, for $\{a_n\}$.
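The contrast between the recurrence and the closed formula can be made concrete. The Python sketch below is illustrative (the function names are not from the text): the first function performs the arithmetic at every step, while the second applies Equation (4.2) directly.

```python
def arithmetic_term_by_recurrence(a0, d, n):
    """Compute a_n by applying a_{k+1} = a_k + d a total of n times."""
    term = a0
    for _ in range(n):
        term += d
    return term

def arithmetic_term_closed(a0, d, n):
    """Compute a_n from the closed formula a_n = d*n + a_0 of Equation (4.2)."""
    return d * n + a0

# Sequence (4.1): a_0 = 6 and d = 7
print(arithmetic_term_by_recurrence(6, 7, 400))  # 2806
print(arithmetic_term_closed(6, 7, 400))         # 2806
```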

Associated with the sequence $\{a_n\}$ is a natural function of the nonnegative
integers, namely, $f(n) = a_n$, $n \ge 0$. Conversely, to any function f of the nonnegative
integers, there corresponds a natural sequence, namely, $\{f(n)\}$. Informally, a closed
formula for $\{a_n\}$ is a "nice" description of the corresponding function, e.g., $\{a_n\}$ is
arithmetic if and only if it corresponds to a function of the form $f(n) = dn + a_0$,
i.e., to a polynomial of degree (at most) 1.

*Do high IQ scores correlate best with an ability to recognize patterns, an ability to choose most plausible
patterns, or an ability to read the minds of the authors of the tests?

Consider the sequence $\{n^2\}$, i.e.,

0, 1, 4, 9, 16, 25, ...

It is not arithmetic. For one thing, the closed formula $f(n) = n^2$ is a nonlinear
polynomial. For another, while $a_{n+1}$ is obtained from $a_n$ by adding an odd number, that
number changes. The difference, $a_{n+1} - a_n = (n + 1)^2 - n^2 = 2n + 1$, is not
constant.

4.1.3 Definition. Let $\{a_n\}$ be a fixed but arbitrary sequence. Its difference
sequence, denoted $\{\Delta a_n\}$, is defined by $\Delta a_n = a_{n+1} - a_n$, $n \ge 0$.

Perhaps $\Delta(a_n)$ would be a better notation. Certainly, $\Delta a_n$ should not be confused
with a product of $\Delta$ and $a_n$. Whatever the notation, $\{a_n\}$ is an arithmetic sequence if
and only if its difference sequence $\{\Delta a_n\}$ is constant, that is, $\Delta a_n = d$, $n \ge 0$.
When $a_n = n^2$, $\Delta a_n = 2n + 1$. In other words, $\{\Delta n^2\} = \{2n + 1\}$.

If $f(n) = a_n$, $n \ge 0$, then $\Delta a_n = \Delta f(n) = f(n + 1) - f(n)$. It seems that

$$\Delta f(n) = f(n + 1) - f(n) \qquad (4.3)$$

is a kind of discrete derivative.

It can be revealing to look at a sequence and its difference sequence (also called
sequence of differences) side by side. In the case of $\{n^2\}$, the side-by-side
ison looks like this: 

0, 1, 4, 9, 16, 25, 36, 49, ... 

1, 3, 5, 7, 9, 11, 13, ... 

Evidently, the difference sequence of the sequence of perfect squares is the 
sequence of odd numbers. More useful to our present objective is the fact that 
the difference sequence is arithmetic. This suggests looking at the difference 
sequence of a difference sequence. The following difference array gives two 
generations of difference sequences for {n 2 }: 

0, 1, 4, 9, 16, 25, 36, 49, ... 

1, 3, 5, 7, 9, 11, 13, ... 

2, 2, 2, 2, 2, 2, ... 

Denote by $\{\Delta^2 a_n\}$ the difference sequence of the difference sequence. Then,
e.g., $\{\Delta^2 n^2\} = \{2\}$, the constant sequence each of whose terms is 2. In general,

$$\Delta^2 a_n = \Delta a_{n+1} - \Delta a_n = a_{n+2} - 2a_{n+1} + a_n. \qquad (4.4)$$
[Figure 4.1.1: an array whose row r lists $\Delta^r a_0, \Delta^r a_1, \Delta^r a_2, \Delta^r a_3, \ldots$ for r = 0, 1, 2, 3, ...]

Figure 4.1.1. A generic difference array.

Letting $\Delta^0 a_n = a_n$ and $\Delta^1 a_n = \Delta a_n$, we can define $\Delta^{r+1} a_n = \Delta(\Delta^r a_n)$ for all
$r \ge 1$, i.e.,

$$\Delta^{r+1} a_n = \Delta^r a_{n+1} - \Delta^r a_n, \quad r \ge 1. \qquad (4.5)$$

Successive generations of difference sequences are displayed in Fig. 4.1.1.

4.1.4 Example. The difference array for $\{n^3\}$ is

0, 1, 8, 27, 64, 125, 216, 343, ... 

1, 7, 19, 37, 61, 91, 127, ... 
6, 12, 18, 24, 30, 36, ... 

6, 6, 6, 6, 6, ... 

While one could write out additional rows, there isn't much point in doing so. If the
fourth row, corresponding to $\{\Delta^3 n^3\}$, is constant, then each row after the fourth
consists entirely of zeros. But, is the fourth row really constant? Let's see.

If $\{a_n\}$ is any sequence, then $\Delta a_n = a_{n+1} - a_n$. From Equation (4.4), $\Delta^2 a_n =
a_{n+2} - 2a_{n+1} + a_n$. From Equation (4.5),

$$\Delta^3 a_n = \Delta^2 a_{n+1} - \Delta^2 a_n$$
$$= (a_{n+3} - 2a_{n+2} + a_{n+1}) - (a_{n+2} - 2a_{n+1} + a_n)$$
$$= a_{n+3} - 3a_{n+2} + 3a_{n+1} - a_n. \qquad (4.6)$$

Substituting $a_n = n^3$ into Equation (4.6) yields

$$\Delta^3 n^3 = (n + 3)^3 - 3(n + 2)^3 + 3(n + 1)^3 - n^3$$
$$= (n^3 + 9n^2 + 27n + 27) - 3(n^3 + 6n^2 + 12n + 8) + 3(n^3 + 3n^2 + 3n + 1) - n^3$$
$$= 6$$

for all n. □
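A difference array is easy to generate by machine. The Python sketch below is an illustration with a hypothetical function name; it builds each new row from consecutive differences of the row above, exactly as in Equation (4.5), and reproduces the array of Example 4.1.4 from the first eight cubes.

```python
def difference_array(terms, generations):
    """Return [terms, first differences, second differences, ...]."""
    rows = [list(terms)]
    for _ in range(generations):
        previous = rows[-1]
        rows.append([b - a for a, b in zip(previous, previous[1:])])
    return rows

for row in difference_array([n**3 for n in range(8)], 3):
    print(row)
# [0, 1, 8, 27, 64, 125, 216, 343]
# [1, 7, 19, 37, 61, 91, 127]
# [6, 12, 18, 24, 30, 36]
# [6, 6, 6, 6, 6]
```

Each generation is one entry shorter than the one before, which is why a finite fragment of a sequence supports only finitely many rows.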

Is it too early to guess a pattern? Might $\{\Delta^4 a_n\}$ be constant when $a_n = n^4$? More
generally, might $\{\Delta^r a_n\}$ be constant when $\{a_n\} = \{n^r\}$? If so, can the constant be
predicted in advance? Before we can answer such questions, we need to know a
little more about $\{\Delta^r a_n\}$.

4.1.5 Lemma. If $\{a_n\}$ is a sequence then, for all $n \ge 0$,

$$\Delta^r a_n = \sum_{t=0}^{r} (-1)^{r+t}\, C(r, t)\, a_{n+t}.$$

Proof. The identity has already been established for small r (see, e.g., Equations
(4.4) and (4.6)). From Equation (4.5) and induction on r,

$$\Delta^{r+1} a_n = \Delta^r a_{n+1} - \Delta^r a_n$$
$$= \sum_{t=0}^{r} (-1)^{r+t}\, C(r, t)\, a_{n+1+t} - \sum_{t=0}^{r} (-1)^{r+t}\, C(r, t)\, a_{n+t}$$
$$= \sum_{t=1}^{r+1} (-1)^{r+t-1}\, C(r, t - 1)\, a_{n+t} + \sum_{t=0}^{r} (-1)^{r+t+1}\, C(r, t)\, a_{n+t}$$
$$= a_{n+r+1} + \sum_{t=1}^{r} (-1)^{r+t+1}\, [C(r, t - 1) + C(r, t)]\, a_{n+t} + (-1)^{r+1} a_n$$
$$= \sum_{t=0}^{r+1} (-1)^{r+1+t}\, C(r + 1, t)\, a_{n+t}. \quad ■$$

With the help of Lemma 4.1.5, we can answer our questions about $\{\Delta^r n^r\}$.

4.1.6 Theorem. Suppose r is a fixed but arbitrary positive integer. Let $a_n = n^r$,
$n \ge 0$. Then $\Delta^r a_n = r!$, $n \ge 0$.

Proof. By Lemma 4.1.5,

$$\Delta^r n^r = \sum_{t=0}^{r} (-1)^{r+t}\, C(r, t)\, (n + t)^r$$
$$= \sum_{t=0}^{r} (-1)^{r+t}\, C(r, t) \sum_{m=0}^{r} C(r, m)\, n^{r-m} t^m$$
$$= \sum_{m=0}^{r} C(r, m)\, n^{r-m} \sum_{t=0}^{r} (-1)^{r+t}\, C(r, t)\, t^m$$
$$= \sum_{m=0}^{r} C(r, m)\, n^{r-m}\, r!\, S(m, r)$$

by Stirling's identity. Because the Stirling number of the second kind, S(m, r), is
equal to 0 when m < r and equal to 1 when m = r, the only surviving term in
the final summation is $C(r, r)\, n^{r-r}\, r! = r!$. ■
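Lemma 4.1.5 and Theorem 4.1.6 can be spot-checked together. The Python sketch below is illustrative (the function name `rth_difference` is hypothetical): it evaluates the alternating binomial sum of the lemma for $a_n = n^r$ and compares the result with $r!$ over a range of n.

```python
from math import comb, factorial

def rth_difference(a, r, n):
    """Delta^r a_n via Lemma 4.1.5; a is a function of a nonnegative integer."""
    return sum((-1) ** (r + t) * comb(r, t) * a(n + t) for t in range(r + 1))

for r in range(1, 7):
    values = {rth_difference(lambda n: n**r, r, n) for n in range(10)}
    print(r, values, factorial(r))  # each set contains the single value r!
```

Raising the order of the difference by one, e.g. `rth_difference(lambda n: n**3, 4, n)`, returns 0 for the same reason that a constant sequence has zero differences.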
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4.1.7 Corollary. Suppose m is a fixed but arbitrary positive integer. Then 
A r+l n m = Ofor alln>0 and all r > m. 

Proof. From Theorem 4.1.6, A m+l n m = A(A m n m ) = Ami = ml - ml = 0. If 
r > m, then A r+l n m = A r - m {A m+l n m ) = A'- m 0 = 0. ■ 
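
Theorem 4.1.6 and Corollary 4.1.7 are easy to spot-check by machine. The following Python sketch evaluates the alternating sum of Lemma 4.1.5 directly; the function name `rth_difference` and the ranges tested are our own choices, not part of the text, and a finite check is of course no substitute for the proofs.

```python
from math import comb, factorial

def rth_difference(seq, r, n=0):
    """Delta^r a_n via Lemma 4.1.5: sum over t of (-1)^(r+t) C(r,t) a_(n+t)."""
    return sum((-1) ** (r + t) * comb(r, t) * seq[n + t] for t in range(r + 1))

for r in range(1, 7):
    powers = [n ** r for n in range(r + 6)]          # a_n = n^r, n = 0, ..., r+5
    # Theorem 4.1.6: the r-th difference sequence of {n^r} is constantly r!.
    assert [rth_difference(powers, r, n) for n in range(5)] == [factorial(r)] * 5
    # Corollary 4.1.7 (case r + 1): the next difference sequence vanishes.
    assert [rth_difference(powers, r + 1, n) for n in range(5)] == [0] * 5
```

For instance, `rth_difference([n ** 3 for n in range(10)], 3, 2)` reproduces the value 6 found in Example 4.1.4.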

Corollary 4.1.7 remains valid when $n^m$ is replaced by any polynomial in n of
degree m.

4.1.8 Theorem. Let m be a fixed but arbitrary positive integer. Suppose f is a
polynomial of degree m. If $a_n = f(n)$, $n \ge 0$, then $\Delta^{r+1} a_n = 0$ for all $n \ge 0$ and all
$r \ge m$.

Proof. Suppose $\{y_n\}$ and $\{z_n\}$ are sequences. Let b and c be numbers. Then

$$\begin{aligned}
\Delta(b y_n + c z_n) &= (b y_{n+1} + c z_{n+1}) - (b y_n + c z_n) \\
&= b(y_{n+1} - y_n) + c(z_{n+1} - z_n) \\
&= b\,\Delta y_n + c\,\Delta z_n.
\end{aligned}$$

So, $\Delta$ is linear. Therefore,

$$\begin{aligned}
\Delta^2(b y_n + c z_n) &= \Delta(\Delta(b y_n + c z_n)) \\
&= \Delta(b\,\Delta y_n + c\,\Delta z_n) \\
&= b\,\Delta^2 y_n + c\,\Delta^2 z_n,
\end{aligned}$$

and, more generally, $\Delta^k(b y_n + c z_n) = b\,\Delta^k y_n + c\,\Delta^k z_n$ for all $k \ge 1$. If
$f(x) = c_0 x^m + c_1 x^{m-1} + \cdots + c_m$ and $a_n = f(n)$, $n \ge 0$, then

$$\begin{aligned}
\Delta^{r+1} a_n &= \Delta^{r+1} f(n) \\
&= \Delta^{r+1}(c_0 n^m + c_1 n^{m-1} + \cdots + c_m) \\
&= c_0\,\Delta^{r+1} n^m + c_1\,\Delta^{r+1} n^{m-1} + \cdots + c_m\,\Delta^{r+1}(1) \\
&= 0
\end{aligned}$$

by linearity and Corollary 4.1.7. ■

4.1.9 Example. Consider the sequence $\{a_n\}$ the first few terms of which are

1, 6, 15, 28, 45, 66, 91, ...

Successive terms of this fragment of a sequence differ by 5, 9, 13, 17, 21, and 25,
respectively. If this pattern were to continue, the difference sequence would be
arithmetic, with $\Delta a_n = 4n + 5$, $n \ge 0$, and the second difference sequence would
be {4}. This is consistent with the nth term of the (original) sequence being of
the form $f(n) = an^2 + bn + c$. Substituting n = 0, 1, and 2, respectively, yields the
linear system

c = 1
a + b + c = 6
4a + 2b + c = 15

which has the unique solution a = 2, b = 3, and c = 1. Computations confirm that

$$a_n = f(n) = 2n^2 + 3n + 1, \quad 0 \le n \le 6.$$

□

Some interesting questions are suggested by Example 4.1.9: (1) Is the converse
of Theorem 4.1.8 always true? (2) If so, is there some easy way to find the
polynomial function f, short of solving a system of linear equations? The answers to
these questions are yes and yes. To see why, consider the n = 0 case of Lemma
4.1.5, i.e.,

$$\Delta^r a_0 = \sum_{t=0}^{r} (-1)^{r+t} C(r,t)\, a_t.$$

Multiply both sides of this equation by C(n, r) and sum on r to obtain

$$\sum_{r=0}^{n} C(n,r)\,\Delta^r a_0 = \sum_{r=0}^{n} \sum_{t=0}^{r} (-1)^{r+t} C(n,r)\, C(r,t)\, a_t.$$

Because C(r, t) = 0 when t > r, we can let the second sum on the right-hand side
run from t = 0 to t = n. That makes it easy to reverse the order of the summations
so as to obtain

$$\begin{aligned}
\sum_{r=0}^{n} C(n,r)\,\Delta^r a_0 &= \sum_{t=0}^{n} a_t \sum_{r=0}^{n} (-1)^{r+t} C(n,r)\, C(r,t) \\
&= \sum_{t=0}^{n} a_t\, \delta_{n,t} \\
&= a_n \qquad (4.7)
\end{aligned}$$

by the alternating-sign theorem for inverting Pascal matrices.*

*Strictly speaking, we have used an extension of the alternating-sign theorem found in Exercise 25,
Section 1.5.

For the sequence fragment in Example 4.1.9, $\Delta^0 a_0 = a_0 = 1$, $\Delta^1 a_0 = \Delta a_0 = 5$,
$\Delta^2 a_0 = 4$, and $\Delta^r a_0 = 0$ for all $r \ge 3$. Thus, according to Equation (4.7),

$$\begin{aligned}
a_n &= C(n,0) \times 1 + C(n,1) \times 5 + C(n,2) \times 4 \\
&= 1 + 5n + 4n(n-1)/2 \\
&= 2n^2 + 3n + 1,
\end{aligned}$$

precisely the polynomial obtained in the example by solving a system of linear
equations.
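
The passage from a sequence fragment to its leading edge, and back again by way of Equation (4.7), is mechanical enough to hand to a computer. Here is a minimal Python sketch; the helper names `difference_array`, `leading_edge`, and `term_from_edge` are hypothetical labels of our own, and the reconstruction beyond the given fragment assumes that all higher differences really are zero.

```python
from math import comb

def difference_array(terms):
    """Rows of the difference array, starting with the sequence fragment itself."""
    rows = [list(terms)]
    while len(rows[-1]) > 1:
        prev = rows[-1]
        rows.append([b - a for a, b in zip(prev, prev[1:])])
    return rows

def leading_edge(terms):
    """First column of the difference array: Delta^r a_0 for r = 0, 1, 2, ..."""
    return [row[0] for row in difference_array(terms)]

def term_from_edge(edge, n):
    """Equation (4.7): a_n = sum over r of C(n, r) * Delta^r a_0."""
    return sum(comb(n, r) * d for r, d in enumerate(edge))

fragment = [1, 6, 15, 28, 45, 66, 91]            # Example 4.1.9
edge = leading_edge(fragment)                    # [1, 5, 4, 0, 0, 0, 0]
assert [term_from_edge(edge, n) for n in range(7)] == fragment
assert all(term_from_edge(edge, n) == 2 * n * n + 3 * n + 1 for n in range(50))
```

Note that `math.comb(n, r)` returns 0 when r > n, which matches the convention C(n, r) = 0 used in the derivation.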

If $f(n) = a_n$, $n \ge 0$, then $\Delta^r a_n = \Delta^r f(n)$, $r, n \ge 0$. In particular, $\Delta^r a_0 = \Delta^r f(0)$
for all $r \ge 0$. Hence, by Equation (4.7),

$$f(n) = \sum_{r=0}^{n} C(n,r)\,\Delta^r f(0) = \sum_{r=0}^{n} \frac{\Delta^r f(0)}{r!}\, n^{(r)}$$

because $C(n,r) = n^{(r)}/r!$. Since $n^{(r)} = 0$, r > n, this last equation can be expressed
in the form

$$f(n) = \sum_{r=0}^{\infty} \frac{\Delta^r f(0)}{r!}\, n^{(r)}, \qquad (4.8)$$

a discrete analog of the Maclaurin* series from calculus.

If f happens to be a polynomial of degree m, a combination of Theorem 4.1.8
and Equation (4.8) yields that

$$f(n) = \sum_{r=0}^{m} \frac{\Delta^r f(0)}{r!}\, n^{(r)}. \qquad (4.9)$$

Conversely, if $\{\Delta^m a_n\}$ is constant, so that $\{\Delta^r a_n\} = \{0\}$ for all r > m, then
$\Delta^r f(0) = \Delta^r a_0 = 0$, r > m, and Equation (4.8) becomes Equation (4.9).
Since $n^{(r)}$ is a polynomial (in n) of degree r, Equation (4.9) implies that f(n) is a
polynomial of degree at most m. (If $\{\Delta^m a_n\}$ is a nonzero constant, f is a polynomial
of degree exactly m.) This proves the following strong converse of Theorem 4.1.8.

*After Colin Maclaurin (1698-1746).

4.1.10 Theorem. Let $\{a_n\}$ be a sequence. If the mth difference sequence
$\{\Delta^m a_n\}$ is constant, i.e., if $\Delta^{m+1} a_n = 0$ for all $n \ge 0$, then there exists a polynomial
f of degree at most m such that $a_n = f(n)$ for all $n \ge 0$. Moreover,

$$f(n) = \sum_{r=0}^{m} C(n,r)\,\Delta^r a_0. \qquad (4.10)$$

Proof. Equation (4.10) follows either by replacing $n^{(r)}/r!$ with C(n, r) in Equation
(4.9) or by replacing $a_n$ with f(n) in Equation (4.7). ■

Theorem 4.1.10 is a "strong" converse of Theorem 4.1.8 because it does more
than establish the existence of f. Equation (4.10) is an explicit formula; it is the
"easy way" to find f (short of solving a linear system of equations). Note, in
particular, that if $\{\Delta^m a_n\}$ is a constant sequence then f, hence $\{a_n\}$, is completely
determined by the m + 1 numbers $a_0, \Delta a_0, \ldots, \Delta^m a_0$ from the first column (or
leading edge) of the difference array for $\{a_n\}$.

4.1.11 Example. Suppose $\{a_n\}$ is a sequence the first column of whose difference
array is 1, 5, 4, 6, with zeros thereafter. Compute $a_{100}$. Solution: Let $f(n) = a_n$,
$n \ge 0$. Because $\Delta^r a_0 = 0$, $r \ge 4$, Equation (4.10) yields

$$\begin{aligned}
f(n) &= \sum_{r=0}^{3} C(n,r)\,\Delta^r a_0 \\
&= C(n,0) \times 1 + C(n,1) \times 5 + C(n,2) \times 4 + C(n,3) \times 6 \\
&= 1 + 5n + 4n(n-1)/2 + 6n(n-1)(n-2)/6 \\
&= 1 + 5n + 2n^2 - 2n + n^3 - 3n^2 + 2n \\
&= n^3 - n^2 + 5n + 1,
\end{aligned}$$

so $a_{100} = 10^6 - 10^4 + 500 + 1 = 990{,}501$. □
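
As a check on the arithmetic, Equation (4.10) can be evaluated directly from the leading edge without ever expanding the polynomial. A short Python sketch (the names `leading_edge` and `f` are ours):

```python
from math import comb

leading_edge = [1, 5, 4, 6]          # Delta^r a_0 for r = 0, 1, 2, 3; zero thereafter

def f(n):
    """Equation (4.10): f(n) = sum over r of C(n, r) * Delta^r a_0."""
    return sum(comb(n, r) * d for r, d in enumerate(leading_edge))

# Agreement with the expanded polynomial n^3 - n^2 + 5n + 1 from the example.
assert all(f(n) == n ** 3 - n ** 2 + 5 * n + 1 for n in range(200))
print(f(100))                        # 990501
```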

4.1.12 Example. Let m be a fixed positive integer and $\{a_n\}$ be the sequence
whose nth term is $a_n = n^m$, $n \ge 0$. From Equation (4.10) (and Corollary 4.1.7), we
obtain

$$n^m = \sum_{r=0}^{m} C(n,r)\,\Delta^r a_0.$$

On the other hand, from Corollary 2.2.3,

$$n^m = \sum_{r=0}^{m} r!\, S(m,r)\, C(n,r),$$

where S(m, r) is a Stirling number of the second kind. Together with the fact that
$C(n,r) = n^{(r)}/r!$, these equations imply that $r!\,S(m,r) = \Delta^r a_0$, $0 \le r \le m$. (See
Exercise 17, below.) The numbers comprising the leading edge of the difference array
for the sequence $\{n^m\}$ are $\Delta^r a_0 = r!\,S(m,r)$, $r \ge 0$.

    0,   1,   16,   81,   256,   625,   1296,   2401,
    1,   15,  65,   175,  369,   671,   1105,
    14,  50,  110,  194,  302,   434,
    36,  60,  84,   108,  132,
    24,  24,  24,   24,
    0,   0,   0,

Figure 4.1.2. The difference array for $\{n^4\}$.

Let's check it out. For m = 4, 0!S(4, 0) = 1 × 0 = 0, 1!S(4, 1) = 1 × 1 =
1, 2!S(4, 2) = 2 × 7 = 14, 3!S(4, 3) = 6 × 6 = 36, 4!S(4, 4) = 24 × 1 = 24, and
5!S(4, 5) = 120 × 0 = 0. Compare the sequence

0, 1, 14, 36, 24, 0, ...

with the first column of the difference array for $\{n^4\}$ shown in Fig. 4.1.2. □

4.1.13 Example. Perhaps the techniques of this section can be made to yield
additional new insights about Stirling numbers of the second kind. Consider, e.g.,
the sequence

S(k, 0), S(k + 1, 1), S(k + 2, 2), S(k + 3, 3), ...,

where k is fixed but arbitrary. (The previous example involved S(m, r) where m was
fixed. This time, m - r = k is fixed.) When k = 2, the first few terms of the sequence
are

0, 1, 7, 25, 65, 140, 266, 462, ...

The initial portion of the difference array for this sequence is illustrated in
Fig. 4.1.3. If the fourth difference sequence, corresponding to the fifth row of the
difference array, really is the constant sequence {3} then, from Equation (4.10),
there is some polynomial $f_2$ of degree 4 such that $S(2 + n, n) = f_2(n)$ for all
$n \ge 0$. Moreover, from the leading edge of Fig. 4.1.3,

$$\begin{aligned}
f_2(n) &= C(n,1) + 5C(n,2) + 7C(n,3) + 3C(n,4) \\
&= [C(n,1) + C(n,2)] + 4[C(n,2) + C(n,3)] + 3[C(n,3) + C(n,4)] \\
&= C(n+1,2) + 4C(n+1,3) + 3C(n+1,4) \\
&= [C(n+1,2) + C(n+1,3)] + 3[C(n+1,3) + C(n+1,4)] \\
&= C(n+2,3) + 3C(n+2,4).
\end{aligned}$$

    0,   1,   7,    25,   65,   140,   266,   462,
    1,   6,   18,   40,   75,   126,   196,
    5,   12,  22,   35,   51,   70,
    7,   10,  13,   16,   19,
    3,   3,   3,    3,

Figure 4.1.3. Difference array for $\{S(2 + n, n)\}$.

Can this be right? Does $S(n + 2, n) = C(n + 2, 3) + 3C(n + 2, 4)$ for all $n \ge 0$? (See
Exercise 23.) If so, what about $S(n + 3, n)$? (See Exercise 24.) □

4.1. EXERCISES 

1 Compute $a_{497}$ if $\{a_n\}$ is an arithmetic sequence satisfying

(a) $a_0 = 1$ and $a_1 = 4$.

(b) $a_2 = 76$ and $a_1 = 80$.

(c) $a_{461} = 1860$ and $a_{462} = 1864$.

2 Equation (4.2) expresses the nth term of an arithmetic sequence $\{a_n\}$ in terms
of $a_0$, n, and d. Some people prefer to denote the first number in a sequence,
not by $a_0$, but by $a_1$. This system has the advantage that the nth number and
the nth term of the sequence are both $a_n$. If an arithmetic sequence begins with
$a_1$ and satisfies $a_{n+1} = a_n + d$, $n \ge 1$, give a formula for $a_n$ in terms of $a_1$, n,
and d.

3 Let $\{a_n\}$ be the sequence 3, 4, 9, 18, ... defined by $a_0 = 3$ and $a_{n+1} = a_n + 4n + 1$,
$n \ge 0$.

(a) Compute $a_4, a_5, \ldots, a_8$.

(b) Compute $\Delta a_n$.

(c) Starting with a first row consisting of the nine numbers $a_0, a_1, \ldots, a_8$,
exhibit the rest of the difference array for $\{a_n\}$.

(d) Prove that $a_n = 2n^2 - n + 3$, $n \ge 0$.

(e) Let $\{b_n\}$ be the sequence defined by $\Delta b_n = a_n$, $n \ge 0$. Find a polynomial g
such that $b_n = g(n)$ for all $n \ge 0$.

4 Let $\{a_n\}$ be the sequence 3, 4, 8, 17, ..., where $a_0 = 3$ and
$a_{n+1} = a_n + (n + 1)^2$, $n \ge 0$. Find a polynomial f such that $a_n = f(n)$, $n \ge 0$, by

(a) using Equation (4.10).

(b) writing the sequence in the form

$3,\ 3 + 1^2,\ 3 + 1^2 + 2^2,\ \ldots$

and using the formula for the sum of the squares of the first n positive integers.

5 Let $\{a_n\}$ be the sequence 1, 2, 4, 8, ..., where $a_0 = 1$ and $a_{n+1} = 2a_n$, $n \ge 0$.

(a) Exhibit the difference array for $\{a_n\}$.

(b) Prove that there does not exist a polynomial f such that $a_n = f(n)$ for all
$n \ge 0$.

6 Recall that the Fibonacci sequence $\{F_n\}$ is defined by $F_0 = F_1 = 1$ and
$F_{n+1} = F_n + F_{n-1}$, $n \ge 1$. Prove that there is no polynomial f such that
$F_n = f(n)$ for all $n \ge 0$.

7 Let $\{a_n\}$ be an arithmetic sequence. Prove that the sum of the first k numbers
in the sequence is given by the formula

$$a_0 + a_1 + \cdots + a_{k-1} = (a_0 + a_{k-1})k/2.$$

8 Compute the sum of the first 100 numbers ($a_0 + a_1 + \cdots + a_{99}$) of the
arithmetic sequence $\{a_n\}$ that begins

(a) 1, 2, 3, ... (b) 1, 3, 5, ...

(c) 2, 4, 6, ... (d) 7, 11, ...
(e) 7, 13, ... (f) 123, 133, ...

9 Let $\{a_n\}$ be a sequence that satisfies $\Delta^{m+1} a_n = 0$, $n \ge 0$. Prove that the sum of
the first k numbers in the sequence is given by the formula

$$\sum_{n=0}^{k-1} a_n = \sum_{r=0}^{m} C(k, r+1)\,\Delta^r a_0.$$

10 Show that the formula in Exercise 7 is the m = 1 case of the formula in
Exercise 9.

11 Let $\{a_n\}$ be the sequence 3, 4, 9, 18, ... defined by $a_0 = 3$ and $a_{n+1} = a_n + 4n + 1$,
$n \ge 0$.

(a) Show that $\Delta^0 a_0 = 3$, $\Delta^1 a_0 = 1$, $\Delta^2 a_0 = 4$, and $\Delta^t a_n = 0$ for all $t \ge 3$ and
all $n \ge 0$.

(b) Confirm the k = 4 case of the formula in Exercise 9 for this sequence by
showing that 3 + 4 + 9 + 18 = 3 × C(4, 1) + 1 × C(4, 2) + 4 × C(4, 3).

(c) Compute the sum 3 + 4 + 9 + 18 + ··· + 123 by filling in the missing
entries (indicated by the ellipsis) and doing all the additions.

(d) Compute the sum 3 + 4 + 9 + 18 + ··· + 123 by first using Exercise 3(a)
to deduce that $123 = a_8$ and then using Exercise 9.

(e) Show that 3 + 4 + 9 + 18 + ··· + 1131 = 9575.

(f) Compute the sum 3 + 4 + 9 + 18 + ··· + 19,506.

12 Let $\{a_n\}$ be the sequence 3, 4, 8, 17, ... from Exercise 4. Use Exercise 9 to
compute the sum ao + a.\ + ■ ■ ■ + a^. 

13 If $a_n = n^m$ for some fixed positive integer m, then Exercise 9 yields the formula

$$1^m + 2^m + \cdots + k^m = \sum_{r=0}^{m} C(k+1, r+1)\,\Delta^r a_0.$$

Use this formula to find a polynomial f such that

(a) $f(k) = 1^2 + 2^2 + \cdots + k^2$.

(b) $f(k) = 1^3 + 2^3 + \cdots + k^3$.

14 Each of the following is a special case of Equation (4.6) applied to the
sequence $\{n^3\}$ from Example 4.1.4. Give a direct, computational confirmation
that

(a) $5^3 - 3 \times 4^3 + 3 \times 3^3 - 2^3 = 6$.

(b) $6^3 - 3 \times 5^3 + 3 \times 4^3 - 3^3 = 6$.

(c) $7^3 - 3 \times 6^3 + 3 \times 5^3 - 4^3 = 6$.

(d) $8^3 - 3 \times 7^3 + 3 \times 6^3 - 5^3 = 6$.

15 Use the approach illustrated in Example 4.1.12 to compute the Stirling
numbers

(a) S(3, n), $1 \le n \le 3$. (b) S(5, n), $1 \le n \le 5$.

16 Let $\{a_n\}$ be a sequence. Prove that there is a polynomial f such that $a_n = f(n)$,
$n \ge 0$, if and only if the terms of the sequence satisfy a recurrence of the form
$a_{n+1} = a_n + g(n)$, $n \ge 0$, where g is a polynomial.

17 Let m be a fixed positive integer. Prove that $\{x^{(r)}/r! : 0 \le r \le m\}$ is a basis
for the vector space of polynomials of degree at most m, where $x^{(r)}$ is the
falling factorial function.

18 Let $\mathbb{R}$ be the set of real numbers. If $f : \mathbb{R} \to \mathbb{R}$ is a function, let $\Delta f : \mathbb{R} \to \mathbb{R}$
be its "discrete derivative", i.e., $\Delta f(x) = f(x + 1) - f(x)$.

(a) Prove that $\Delta x^{(m)} = m x^{(m-1)}$, where $x^{(m)}$ is the falling factorial function.

(b) Prove that $\Delta 2^x = 2^x$.

(c) Find an analog for discrete differentiation of the "product rule" for
ordinary differentiation.

19 The sequence $\{a_n\}$ is said to be convex if

$$\frac{a_{n+2} + a_n}{2} \ge a_{n+1}, \quad n \ge 0.$$

(a) Show that $\{a_n\}$ is convex if and only if each term of its second difference
sequence $\{\Delta^2 a_n\}$ is nonnegative.

(b) Compare and contrast part (a) with the theorem from calculus that f is
concave up on the open interval I whenever its second derivative
$f''(x) > 0$ on I.

(c) Let p(n) be, not the value of a polynomial function at x = n, but the
number of partitions of n. Show that the sequence $\{a_n\}$ defined by
$a_n = p(n + 1)$, $n \ge 0$, is convex.

20 Let r be a fixed positive integer. Define $a_n = C(n, r)$, $n \ge 0$. Find a closed
formula for $\Delta a_n$.

21 Suppose $a_n = f(n)$, $n \ge 0$, where $f(x) = b_r x^r + b_{r-1} x^{r-1} + \cdots + b_1 x + b_0$.

(a) Show that $u = Q_r v$, where $u = (a_0, a_1, \ldots, a_r)^t$, the transpose of
$(a_0, a_1, \ldots, a_r)$, $v = (b_0, b_1, \ldots, b_r)^t$, and

$$Q_r = \begin{pmatrix}
1 & 0 & 0 & \cdots & 0 \\
1^0 & 1^1 & 1^2 & \cdots & 1^r \\
2^0 & 2^1 & 2^2 & \cdots & 2^r \\
3^0 & 3^1 & 3^2 & \cdots & 3^r \\
\vdots & \vdots & \vdots & & \vdots \\
r^0 & r^1 & r^2 & \cdots & r^r
\end{pmatrix}.$$

(b) Show that

$$Q_3^{-1} = \frac{1}{6}\begin{pmatrix}
6 & 0 & 0 & 0 \\
-11 & 18 & -9 & 2 \\
6 & -15 & 12 & -3 \\
-1 & 3 & -3 & 1
\end{pmatrix}.$$

(c) Consider the sequence $\{a_n\}$ whose first few terms are 1, 3, 8, 17, ... .
Given that $\Delta^4 a_n = 0$, $n \ge 0$, use parts (a) and (b) to find a polynomial f(x)
(of degree at most 3) such that $a_n = f(n)$, $n \ge 0$.

(d) Consider the sequence $\{a_n\}$ described in part (c). Use Equation (4.10) to
find a polynomial f(x) (of degree at most 3) such that $a_n = f(n)$, $n \ge 0$.

22 Much of our knowledge of ancient Egyptian mathematics comes from the
(seventeenth century BC) Rhind papyrus. The papyrus contains the following
sequence: 7, 49, 343, 2301, ___. What is the fifth number in the sequence?
Justify your answer. (Hint: Anyone can make mistakes.)

23 Prove the identity

(a) $S(n + 2, n) = C(n + 2, 3) + 3C(n + 2, 4)$, from Example 4.1.13.

(b) $S(n + 2, n) = n(n + 1)(n + 2)(3n + 1)/24$.

24 Use the technique illustrated in Example 4.1.13 to show that

(a) $S(n + 3, n) = C(n + 3, 4) + 10C(n + 3, 5) + 15C(n + 3, 6)$.

(b) $S(n + 3, n) = n^2(n + 1)^2(n + 2)(n + 3)/48$.

25 Use Equation (4.10) to express $f(n) = 3n^2 + 2n + 1$ as a linear combination of
binomial coefficients.

26 Express the polynomial f(n) from Exercise 25 as a linear combination of
falling factorial functions (of n).

27 Consider the sequence 0, 1, 3, 6, 10, 15, ... whose nth term is $S(n + 1, n)$,
$n \ge 0$.

(a) Use the technique illustrated in Example 4.1.13 to show that
$S(n + 1, n) = C(n + 1, 2)$.

(b) Give a combinatorial proof of the identity $S(n + 1, n) = C(n + 1, 2)$.

28 Suppose $\{a_n\}$ is the sequence determined by the initial condition $a_0 = 0$ and
the recurrence $a_n = a_{n-1} + n^2$, $n \ge 1$.

(a) Exhibit the difference array for $\{a_n\}$.

(b) Use part (a) to find a polynomial f such that $a_n = f(n)$, $n \ge 0$.

29 Suppose r and s are nonnegative integers satisfying $r \ge s + 2$. Let
$n = (2s + 1) + (2s + 3) + \cdots + (2r - 1)$.

(a) Prove that n is a difference of squares.

(b) Prove that n = ab, where a and b are integers (strictly) larger than 1 both
of which are even or both of which are odd.

30 Show that there are 10 different ways to express 945 as a sum of (two or more)
consecutive odd positive integers. (Hint: Exercise 29 and the fact that
$945 = 3^3 \times 5 \times 7$.)

31 Prove the following identity (attributed to Galileo):

$$\frac{1}{3} = \frac{1 + 3}{5 + 7} = \frac{1 + 3 + 5}{7 + 9 + 11} = \cdots = \frac{1 + 3 + \cdots + (2n - 1)}{(2n + 1) + \cdots + (4n - 1)} = \cdots$$

32 Suppose n > 1 is a difference of (positive) squares. Prove that
$n = (2s + 1) + (2s + 3) + \cdots + (2r - 1)$, where r and s are nonnegative integers
satisfying $r \ge s + 2$.

33 Suppose n = ab, where a and b are integers (strictly) larger than 1 both of
which are even or both of which are odd. Prove that
$n = (2s + 1) + (2s + 3) + \cdots + (2r - 1)$, where r and s are nonnegative integers
satisfying $r \ge s + 2$.

4.2. ORDINARY GENERATING FUNCTIONS

Consider the sequence 3, 6, 12, 24, ..., where $a_0 = 3$ and $a_{n+1} = 2a_n$, $n \ge 0$. Pretty
clearly, no row of the difference array

3, 6, 12, 24, ...
3, 6, 12, 24, ...
3, 6, 12, 24, ...

will ever be constant, much less consist entirely of zeros. So, by Theorem 4.1.8, the
function defined by $f(n) = a_n$, $n \ge 0$, is not a polynomial. Indeed, by not doing any
arithmetic, it is easy to see from the symbolic representation

3, 3 × 2, (3 × 2) × 2, (3 × 2 × 2) × 2, ...

that $f(n) = 3 \times 2^n$, $n \ge 0$.

4.2.1 Definition. The sequence $\{a_n\}$ is geometric if it satisfies a recurrence of
the form $a_{n+1} = d\,a_n$, $n \ge 0$, where d is a constant, independent of n.

Evidently, the nth term of a generic geometric sequence is given by the closed
formula $a_n = a_0 \times d^n$, $n \ge 0$.
Consider the sequence

3, 4, 22, 46, 178, 454, ... (4.11)

defined by $a_0 = 3$, $a_1 = 4$, and $a_n = a_{n-1} + 6a_{n-2}$, $n \ge 2$. This one is neither
arithmetic nor geometric. While there is a simple closed formula for $a_n$, its discovery
requires either an inspired guess or a new approach.

4.2.2 Definition. The (ordinary) generating function for the sequence $\{a_n\}$ is

$$g(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots \qquad (4.12)$$

Generating functions come in assorted sizes, shapes, and flavors. The pattern
inventory* $W_G(x_1, x_2, \ldots, x_n)$ is one kind of generating function; Equation (4.12)
is another. The name "generating function" is more than a little curious. The
pattern inventory doesn't generate anything; it is generated by the cycle index
polynomial.† Moreover, as we are about to see, it is useful to view g(x) as something
other than a function!

*The subject of Section 3.6.
†The subject of Section 3.7.

If g(x) is the generating function for Sequence (4.11), then

$$\begin{aligned}
g(x) &= 3 + 4x + 22x^2 + 46x^3 + 178x^4 + \cdots + a_n x^n + \cdots \\
-x\,g(x) &= \phantom{3} - 3x - 4x^2 - 22x^3 - 46x^4 - \cdots - a_{n-1} x^n - \cdots \\
-6x^2 g(x) &= \phantom{3 - 3x} - 18x^2 - 24x^3 - 132x^4 - \cdots - 6a_{n-2} x^n - \cdots
\end{aligned}$$

Summing these three equations produces

$$g(x)(1 - x - 6x^2) = 3 + x.$$

(The recurrence guarantees that $[a_n - a_{n-1} - 6a_{n-2}]x^n = 0$, $n \ge 2$.) Evidently,

$$\begin{aligned}
g(x) &= 3 + 4x + 22x^2 + 46x^3 + 178x^4 + 454x^5 + \cdots \qquad (4.13a) \\
&= \frac{3 + x}{1 - x - 6x^2}. \qquad (4.13b)
\end{aligned}$$

A typical backpacker will sacrifice many things to decrease weight. Freeze-dried 
food is a good example. Why carry water (even as a constituent of food) if it is 
available at campsites? Equation (4.13b) might be viewed as a freeze-dried version 
of Equation (4.13a). (If you had to stuff g(x) into a backpack, which version would 
you prefer?) 

Okay. Imagine yourself at a campsite. What is the easy way to resurrect (or
generate) the sequence $\{a_n\}$ from $g(x) = (3 + x)/(1 - x - 6x^2)$? One perfectly
acceptable alternative is long division. Another is to factor the denominator as
$(1 + 2x)(1 - 3x)$, so that

$$g(x) = (3 + x)\left(\frac{1}{1 + 2x}\right)\left(\frac{1}{1 - 3x}\right).$$

Recall that

$$\frac{1}{1 - x} = 1 + x + x^2 + x^3 + x^4 + \cdots, \qquad (4.14)$$

so

$$\frac{1}{1 + 2x} = 1 + (-2x) + (-2x)^2 + (-2x)^3 + \cdots \qquad (4.15)$$

and

$$\frac{1}{1 - 3x} = 1 + 3x + (3x)^2 + (3x)^3 + \cdots. \qquad (4.16)$$

Therefore, g(x) can be expressed as the (formidable looking) product

$$g(x) = (3 + x)(1 - 2x + 4x^2 - 8x^3 + \cdots)(1 + 3x + 9x^2 + 27x^3 + \cdots).$$

A third, easier approach is to make use of the method of partial fractions*, i.e., to
write

$$g(x) = \frac{3 + x}{1 - x - 6x^2} = \frac{3 + x}{(1 + 2x)(1 - 3x)} = \frac{1}{1 + 2x} + \frac{2}{1 - 3x}.$$

Together with Equations (4.15) and (4.16), this yields

$$\begin{aligned}
g(x) &= [1 + (-2x) + (-2x)^2 + \cdots] + 2[1 + 3x + (3x)^2 + \cdots] \\
&= [1 - 2x + 4x^2 - 8x^3 + \cdots] + [2 + 6x + 18x^2 + 54x^3 + \cdots] \\
&= 3 + 4x + 22x^2 + 46x^3 + \cdots,
\end{aligned}$$

and the generating function has been reassembled. There is more. Obscured by the
rush to compute is a closed formula for $a_n$. Comparing the coefficients of $x^n$ in

$$g(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots$$

and

$$g(x) = [1 + (-2x) + (-2x)^2 + \cdots] + 2[1 + 3x + (3x)^2 + \cdots]$$

yields

$$a_n = (-2)^n + 2(3^n), \quad n \ge 0. \qquad (4.17)$$



It is striking, but is it right? Without checking for convergence, what justifies
manipulating the generating "function" just as if it were an honest-to-goodness
function? It would appear that our derivation may have some holes in it. On the
other hand, independently of where it came from, we can prove that Equation
(4.17) is a valid identity.

Define a sequence $\{b_n\}$ by $b_n = 2(3^n) + (-2)^n$, $n \ge 0$. Then $b_0 = 2(3^0) + (-2)^0 = 3 = a_0$
and $b_1 = 2(3) - 2 = 4 = a_1$. So, the first two numbers in the
sequences $\{a_n\}$ and $\{b_n\}$ are the same. If we could prove that the sequences satisfy
the same recurrence, i.e., if $b_n = b_{n-1} + 6b_{n-2}$, $n \ge 2$, it would follow that $b_n = a_n$
for all n.

Observe that

$$2(3^n) = 6(3^{n-1}) = 2(3^{n-1}) + 4(3^{n-1}) = 2(3^{n-1}) + 6[2(3^{n-2})]$$

and

$$(-2)^n = -2(-2)^{n-1} = (-2)^{n-1} - 3(-2)^{n-1} = (-2)^{n-1} + 6(-2)^{n-2}.$$

*You already know how to do partial fractions. If you don't recall all of the details, that's okay. It just
means you will have to dig out your old calculus book and do some reviewing.

Summing the extreme left- and right-hand sides, we obtain

$$\begin{aligned}
b_n &= 2(3^n) + (-2)^n \\
&= [2(3^{n-1}) + (-2)^{n-1}] + 6[2(3^{n-2}) + (-2)^{n-2}] \\
&= b_{n-1} + 6b_{n-2}.
\end{aligned}$$

(Before reading on, confirm that $b_5 = 2(3^5) - 2^5 = 454 = a_5$.)
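
The long-division alternative mentioned earlier is easy to mechanize, and it gives an independent check on Equation (4.17). In the Python sketch below, `series_quotient` is a hypothetical helper of our own that divides one polynomial by another, coefficient by coefficient; it assumes the divisor has a nonzero constant term.

```python
from fractions import Fraction

def series_quotient(num, den, count):
    """First `count` power-series coefficients of num(x)/den(x).

    num and den list polynomial coefficients, constant term first;
    den[0] must be nonzero.
    """
    coeffs = []
    for n in range(count):
        acc = Fraction(num[n]) if n < len(num) else Fraction(0)
        # Subtract the contributions of the coefficients already found.
        for r in range(1, min(n, len(den) - 1) + 1):
            acc -= den[r] * coeffs[n - r]
        coeffs.append(acc / den[0])
    return coeffs

def closed_form(n):
    return (-2) ** n + 2 * 3 ** n            # Equation (4.17)

divided = series_quotient([3, 1], [1, -1, -6], 12)      # (3 + x)/(1 - x - 6x^2)
assert divided[:6] == [3, 4, 22, 46, 178, 454]          # Sequence (4.11)
assert all(closed_form(n) == c for n, c in enumerate(divided))
```

Exact rational arithmetic (`Fraction`) is used only so that the same helper works when the constant term of the divisor is not 1.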

A spelunker is someone who explores caves. Of the many things a spelunker
must do well, perhaps the most important is to keep track of where s/he is relative
to the way out. Let's pause and outline where we are. We used the sequence $\{a_n\}$ to
produce a generating function $g(x) = a_0 + a_1 x + a_2 x^2 + \cdots$. On one level, the plus
signs and powers of x are separators. Like the commas in $a_0, a_1, a_2, \ldots$, they keep
the $a_i$'s apart. On a deeper level, just writing g(x) suggests manipulating it as if it
were a function. (As Leibniz once observed, good notation can lead to startling
insights.) The object of manipulating g(x) was to produce a closed formula
(freeze-dried version). The closed formula gave us another way to look at $\{a_n\}$,
eventually leading to a solution for $a_n$. The disturbing part came at the end, where
it seemed necessary to validate the solution. One way to avoid this verification step
would be to justify the algebraic manipulations leading up to it.

4.2.3 Definition. A formal power series in x is an infinite sum of the form
$a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots$, where the coefficients $a_0, a_1, a_2, a_3, \ldots$ are fixed
constants. It is sometimes convenient to give a shorthand name to a power series,
writing, e.g.,

$$g(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots = \sum_{n \ge 0} a_n x^n.$$

(The expressions $\sum_{n \ge 0} a_n x^n$ and $\sum_{n=0}^{\infty} a_n x^n$ are interchangeable.)

Most of the algebraic manipulations associated with polynomials extend
naturally to formal power series. (If all but finitely many of its coefficients are zero,
a formal power series is a polynomial.) If

$$f(x) = \sum_{n \ge 0} a_n x^n \quad \text{and} \quad g(x) = \sum_{n \ge 0} b_n x^n,$$

then f(x) = g(x) if and only if $a_n = b_n$ for all $n \ge 0$. If c and d are constants, then
$h(x) = c f(x) + d g(x)$ is the formal power series defined by

$$h(x) = c \sum_{n \ge 0} a_n x^n + d \sum_{n \ge 0} b_n x^n = \sum_{n \ge 0} (c a_n + d b_n) x^n. \qquad (4.18)$$

Multiplication of polynomials also extends to formal power series:

$$(a_0 + a_1 x + a_2 x^2 + \cdots)(b_0 + b_1 x + b_2 x^2 + \cdots)
= a_0 b_0 + (a_0 b_1 + a_1 b_0)x + (a_0 b_2 + a_1 b_1 + a_2 b_0)x^2 + \cdots$$

In general,

$$\left(\sum_{n \ge 0} a_n x^n\right)\left(\sum_{n \ge 0} b_n x^n\right) = \sum_{n \ge 0} c_n x^n, \qquad (4.19a)$$

where

$$c_n = \sum_{r=0}^{n} a_r b_{n-r}. \qquad (4.19b)$$

4.2.4 Example. Observe that

$$(1 + x + x^2 + x^3 + x^4 + \cdots)(1 - x) = 1. \qquad (4.20)$$

In fact, this product is just a variation of Equation (4.14). □
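
A computer cannot hold infinitely many coefficients, but it can hold the first N of them, and Equation (4.19b) shows that $c_n$ depends only on $a_0, \ldots, a_n$ and $b_0, \ldots, b_n$. The following Python sketch multiplies two truncated series on that basis; the name `multiply` and the truncation length N are our own choices.

```python
def multiply(a, b):
    """Coefficients c_0, ..., c_(N-1) of a product of power series, where N is
    the shorter of the two truncation lengths (Equation (4.19b))."""
    size = min(len(a), len(b))
    return [sum(a[r] * b[n - r] for r in range(n + 1)) for n in range(size)]

N = 8
ones = [1] * N                               # 1 + x + x^2 + ...
one_minus_x = [1, -1] + [0] * (N - 2)        # 1 - x
assert multiply(ones, one_minus_x) == [1] + [0] * (N - 1)     # Equation (4.20)
```

Every coefficient returned is exact; truncation only limits how many of them are computed.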
It is instructive to turn Example 4.2.4 around. How do we know that

$$\frac{1}{1 - x} = 1 + x + x^2 + x^3 + x^4 + \cdots?$$

One justification comes from calculus:

$$\begin{aligned}
g(x) &= 1 + x + x^2 + x^3 + x^4 + \cdots \\
&= \lim_{n \to \infty} (1 + x + x^2 + \cdots + x^{n-1}) \\
&= \lim_{n \to \infty} \frac{1 - x^n}{1 - x} \\
&= \frac{1}{1 - x},
\end{aligned}$$

$x \in (-1, 1)$, because $\lim_{n \to \infty} x^n = 0$ whenever $|x| < 1$. But, this argument depends
upon viewing $g(x) = 1 + x + x^2 + x^3 + x^4 + \cdots$ as a function, precisely the
perspective we are trying to avoid. What we want is a justification that depends
only on the algebra of formal power series.



4.2.5 Definition. Let g(x) and h(x) be formal power series. If g(x)h(x) = 1, then
h(x) is the reciprocal of g(x), written h(x) = 1/g(x).

Because multiplication of power series is commutative, h(x) is the reciprocal of
g(x) if and only if g(x) is the reciprocal of h(x).

4.2.6 Theorem. The formal power series $g(x) = \sum_{n \ge 0} a_n x^n$ has a reciprocal if
and only if $a_0 \ne 0$. If g(x) has a reciprocal, it is unique.

Proof. Suppose g(x) has a reciprocal, say $h(x) = \sum_{n \ge 0} b_n x^n$. Then, from
Definition 4.2.5 and Equations (4.19a)-(4.19b), $c_0 = a_0 b_0 = 1$, so $a_0 \ne 0$ and $b_0 = 1/a_0$
is uniquely determined by $a_0$. Furthermore, because

$$c_n = \sum_{r=0}^{n} a_r b_{n-r} = 0, \quad n \ge 1,$$

the coefficients $b_1 = -a_1 b_0/a_0$, $b_2 = -(a_1 b_1 + a_2 b_0)/a_0$, and so on, are uniquely
determined (recursively) by $\{a_n\}$.

Conversely, if $a_0 \ne 0$, define $\{b_n\}$ recursively by $b_0 = 1/a_0$, and

$$b_n = -\frac{1}{a_0} \sum_{r=1}^{n} a_r b_{n-r}, \quad n \ge 1.$$

Then, setting $h(x) = \sum_{n \ge 0} b_n x^n$, our definitions yield

$$\sum_{r=0}^{n} a_r b_{n-r} = \delta_{n,0},$$

i.e. (by Equations (4.19a)-(4.19b)), g(x)h(x) = 1. ■
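
The proof of Theorem 4.2.6 is constructive: it says exactly how to compute the coefficients of 1/g(x) one at a time. A minimal Python sketch of that recursion follows. The function name `reciprocal` is ours, integer coefficients are assumed for the input, and exact fractions are used because $b_n$ need not be an integer when $a_0 \ne \pm 1$.

```python
from fractions import Fraction

def reciprocal(a, count):
    """First `count` coefficients b_n of 1/g(x), where g(x) has integer
    coefficients a[0], a[1], ... (missing coefficients are treated as zero)."""
    if a[0] == 0:
        raise ValueError("no reciprocal: the constant term is zero")
    a = list(a) + [0] * count                 # pad so that a[r] exists for r < count
    b = [Fraction(1, a[0])]                   # b_0 = 1/a_0
    for n in range(1, count):
        # b_n = -(a_1 b_(n-1) + a_2 b_(n-2) + ... + a_n b_0) / a_0
        b.append(-sum(a[r] * b[n - r] for r in range(1, n + 1)) / a[0])
    return b

assert reciprocal([1, -1], 6) == [1] * 6                  # 1/(1 - x), Equation (4.14)
assert reciprocal([1, 2], 4) == [1, -2, 4, -8]            # 1/(1 + 2x), Equation (4.15)
assert reciprocal([1, -3], 4) == [1, 3, 9, 27]            # 1/(1 - 3x), Equation (4.16)
```

No convergence question arises anywhere in the computation, which is the point of the theorem.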

Every step in the derivation of Equation (4.17) can now be justified using (only)
algebraic manipulations of formal power series. The solution $a_n = (-2)^n + 2(3^n)$
does not require the generating function for $\{a_n\}$ to be a function. We are on solid
ground again.

The freeze drying of generating functions can involve a variety of techniques.
No single recipe works in every case. All by itself, the method of partial fractions
is pretty much limited to sequences $\{a_n\}$ that satisfy so-called homogeneous linear
recurrences, i.e., recurrences of the form

$$a_n = c_1 a_{n-1} + c_2 a_{n-2} + \cdots + c_k a_{n-k}, \quad n \ge k, \qquad (4.21)$$

where k is a fixed positive integer, and $c_1, c_2, \ldots, c_k$ are constants, independent of n.
The following technical observation will be useful in helping to motivate the
development of a useful tool.

4.2.7 Lemma. If f(x) is the generating function for $\{a_n\}$, then
$g(x) = f(x)/(1 - x) = f(x)[1/(1 - x)]$ is the generating function for $\{s_n\}$, where
$s_n = a_0 + a_1 + \cdots + a_n$.

Proof. From Equation (4.20) and the definition of reciprocals,

$$\frac{1}{1 - x} = 1 + x + x^2 + x^3 + x^4 + \cdots$$

Therefore, from Equations (4.19a)-(4.19b),

$$\left(\sum_{n \ge 0} a_n x^n\right)\left(\sum_{n \ge 0} x^n\right) = \sum_{n \ge 0} \left(\sum_{r=0}^{n} a_r\right) x^n.$$

■



4.2.8 Example. For a fixed but arbitrary positive integer m, let $g_m(x)$ be the
generating function for the sequence $\{s(m, n)\}$ of Stirling numbers of the first kind.
Because s(m, n) = 0 when n = 0 or n > m, it follows from Theorem 2.5.4 that

$$\begin{aligned}
g_m(x) &= \sum_{n=1}^{m} s(m,n)\, x^n \\
&= x(x + 1)(x + 2) \cdots (x + m - 1).
\end{aligned}$$

Differentiating with respect to x, we obtain (by the product rule) that

$$\sum_{n=1}^{m} n\, s(m,n)\, x^{n-1} = \sum_{i=0}^{m-1} \frac{x(x + 1)(x + 2) \cdots (x + m - 1)}{x + i}.$$

Setting x = 1 and dividing both sides by m! yields

$$\frac{1}{m!} \sum_{n=1}^{m} n\, s(m,n) = \frac{1}{1} + \frac{1}{2} + \cdots + \frac{1}{m}, \qquad (4.22a)$$

an identity with some interesting implications.

The harmonic sequence $\{h_n\}$ is defined by $h_0 = 0$ and

$$h_n = \frac{1}{1} + \frac{1}{2} + \cdots + \frac{1}{n}, \quad n > 0.$$

Because s(m, n) is the number of permutations in $S_m$ whose disjoint cycle
factorizations consist of exactly n cycles, the left-hand side of Equation (4.22a) is the
average, over $p \in S_m$, of the number of cycles in p. That this average should equal $h_m$ is
unexpected. There is more! By Theorem 2.5.2, $h_m = s(m + 1, 2)/m!$. Together with
Equation (4.22a), this yields

$$\sum_{n=1}^{m} n\, s(m,n) = s(m + 1, 2), \qquad (4.22b)$$

another surprising result.
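
Both identities can be tested for small m by expanding the product x(x + 1)···(x + m - 1) and reading off its coefficients. The Python sketch below does this with plain coefficient lists; the helper names `rising_factorial_coeffs` and `harmonic` are ours, and s(m, n) is taken in the unsigned sense used in this example.

```python
from fractions import Fraction
from math import factorial

def rising_factorial_coeffs(m):
    """Coefficients s(m, 0), ..., s(m, m) of x(x+1)...(x+m-1), lowest power first."""
    coeffs = [1]                                   # the empty product
    for i in range(m):
        shifted = [0] + coeffs                     # x * p(x)
        scaled = [i * c for c in coeffs] + [0]     # i * p(x)
        coeffs = [u + v for u, v in zip(shifted, scaled)]   # (x + i) * p(x)
    return coeffs

def harmonic(m):
    """h_m = 1/1 + 1/2 + ... + 1/m, with h_0 = 0."""
    return sum(Fraction(1, k) for k in range(1, m + 1))

for m in range(1, 10):
    weighted = sum(n * s for n, s in enumerate(rising_factorial_coeffs(m)))
    assert Fraction(weighted, factorial(m)) == harmonic(m)       # Equation (4.22a)
    assert weighted == rising_factorial_coeffs(m + 1)[2]         # Equation (4.22b)
```

For example, x(x + 1)(x + 2) = 2x + 3x² + x³, so s(3, 1) = 2, s(3, 2) = 3, s(3, 3) = 1, and the average number of cycles over the six permutations in $S_3$ is (2 + 6 + 3)/6 = 11/6 = $h_3$.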

Consider the harmonic generating function

$$\begin{aligned}
h(x) &= \sum_{n \ge 0} h_n x^n \\
&= \sum_{n \ge 0} \frac{s(n + 1, 2)}{n!}\, x^n.
\end{aligned}$$

By Lemma 4.2.7, the formal power series $h(x) = f(x)/(1 - x)$, where

$$f(x) = x + \tfrac{1}{2} x^2 + \tfrac{1}{3} x^3 + \tfrac{1}{4} x^4 + \cdots$$

If this expression defined, not only a function, but a differentiable function, then

$$\begin{aligned}
f'(x) &= 1 + x + x^2 + x^3 + \cdots \\
&= \frac{1}{1 - x}.
\end{aligned}$$

Antidifferentiating this equation yields (because f(0) = 0) $f(x) = -\ln(1 - x)$, from
which it follows that

$$h(x) = \frac{-\ln(1 - x)}{1 - x}.$$

□

Can one do calculus with formal power series without treating them as
functions? In a superficial sense, that is not a problem. One can define the formal
term-by-term derivative of $g(x) = \sum_{n \ge 0} a_n x^n$ by

$$D_x g(x) = \sum_{n \ge 1} n a_n x^{n-1}$$

and, using Equations (4.18)-(4.19b), prove the usual formulas for differentiating
sums and products. The sticky part comes when we want to differentiate both sides,
e.g., of

$$(1 - x)^{-1} = 1 + x + x^2 + x^3 + x^4 + \cdots \qquad (4.23)$$

The (ordinary) derivative of the left-hand side is $D_x (1 - x)^{-1} = (1 - x)^{-2}$. The
formal derivative of the right-hand side is

$$D_x (1 + x + x^2 + x^3 + x^4 + \cdots) = 1 + 2x + 3x^2 + 4x^3 + \cdots$$

Despite using the same symbol, $D_x$, for both operators, setting these two derivatives
equal cannot be justified by arguments based solely on algebraic manipulations of
formal power series. The justification relies on the fact that the right-hand side of
Equation (4.23) has a positive radius of convergence. We need the following result
from calculus.

4.2.9 Theorem. Let r be a positive real number. If the power series

$$a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n + \cdots$$

converges to g(x) for all x in the interval I = (-r, r), then g is differentiable on I,
and the power series

$$D_x g(x) = a_1 + 2a_2 x + 3a_3 x^2 + \cdots + n a_n x^{n-1} + \cdots$$

converges to $g'(x)$ for all $x \in I$. Moreover, for all $x \in I$,

$$\int_0^x g(t)\,dt = a_0 x + \frac{1}{2} a_1 x^2 + \frac{1}{3} a_2 x^3 + \cdots + \frac{1}{n + 1} a_n x^{n+1} + \cdots$$

4.2.10 Example. Consider the sequence 



2,4,31,100,421,..., 



defined by ao — 2, a\ =4, a 2 = 3l, and a n+ \ =4a„ + 3a„_i — 18a„_2, n>2. Let's use 
generating functions to solve for a„: Summing the equations 



g(x) = 2 + 4x + 3lx 2 + 

-4xg(x) = - 8x - 16x 2 - 

— 3x 2 g(x) = — 6x 2 ~ 

18x 3 g(x) = 



100x 3 + ••• + a„x" + ••• 

124x 3 - ••• - 4a„_ix" - ••• 

12x 3 - ••• - 3a n - 2 x" - ••• 

36x 3 + ••• + 18a„_ 3 x" + ••• 
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produces (1 — 4x — 3x 2 + 18x 3 )g(x) = 2 — 4x + 9x 2 . (The recurrence guarantees that 
[a n — 4a„-i -3a„_2 + 18fl„-3]x" = 0 for all n>3.) So, 

2 - 4x + 9x 2 

Six) = 



1 -4x-3x 2 + 18x 3 

2 - Ax + 9x 2 
(1 +2x)(l -3x) 2 



1 

■ + 



l+2x (l-3x) 2 

using partial fractions. (Check it.) 
From Equation (4.16), 

$$\frac{1}{1 - 3x} = 1 + 3x + 3^2x^2 + 3^3x^3 + \cdots + 3^n x^n + \cdots \tag{4.24}$$

What about $1/(1 - 3x)^2$? The brute-force approach would be to square both sides of
Equation (4.24), using Equations (4.19a)-(4.19b) for the right-hand side. But, there
is an easier solution. Because $1 + x + x^2 + x^3 + \cdots$ converges to $(1 - x)^{-1}$ for all
$x \in (-1, 1)$, the right-hand side of Equation (4.24) converges to the left-hand side
whenever $3x \in (-1, 1)$, i.e., for all $x \in (-\frac{1}{3}, \frac{1}{3})$. It follows from Theorem 4.2.9 that
both sides of Equation (4.24) can be differentiated to obtain

$$\frac{3}{(1 - 3x)^2} = 3 + 2(3^2)x + 3(3^3)x^2 + \cdots + n(3^n)x^{n-1} + \cdots,$$

so

$$\frac{1}{(1 - 3x)^2} = 1 + 2(3)x + 3(3^2)x^2 + \cdots + n(3^{n-1})x^{n-1} + \cdots = \sum_{n \ge 0} (n + 1)3^n x^n. \tag{4.25}$$



Adding Equations (4.15) and (4.25), we obtain

$$g(x) = 2 + 4x + 31x^2 + 100x^3 + \cdots + a_n x^n + \cdots = \frac{1}{1 + 2x} + \frac{1}{(1 - 3x)^2} = \sum_{n \ge 0} [(-2)^n + (n + 1)3^n]x^n. \tag{4.26}$$

Therefore, $a_n = (-2)^n + (n + 1)3^n$, $n \ge 0$. □
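
The closed formula of Example 4.2.10 can be checked numerically against the recurrence. The following is a minimal Python sketch; the function names `recurrence_terms` and `closed_form` are illustrative choices, not notation from the text.

```python
def recurrence_terms(count):
    """First `count` terms of a_0 = 2, a_1 = 4, a_2 = 31,
    a_{n+1} = 4a_n + 3a_{n-1} - 18a_{n-2}."""
    terms = [2, 4, 31]
    while len(terms) < count:
        terms.append(4 * terms[-1] + 3 * terms[-2] - 18 * terms[-3])
    return terms[:count]


def closed_form(n):
    """Candidate solution a_n = (-2)^n + (n + 1) 3^n."""
    return (-2) ** n + (n + 1) * 3 ** n


# Compare the two descriptions of the sequence term by term.
terms = recurrence_terms(12)
assert all(closed_form(n) == a for n, a in enumerate(terms))
```

Agreement on finitely many terms is evidence, not a proof; the generating function argument above is what establishes the formula for all n.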
The technique that was used in Example 4.2.10 to pass from Equation (4.24) to
Equation (4.25) has many uses. For example, successive differentiations of

$$\frac{1}{1 - x} = 1 + x + x^2 + \cdots + x^n + \cdots$$

yield

$$\frac{1}{(1 - x)^2} = 1 + 2x + 3x^2 + \cdots + (n + 1)x^n + \cdots,$$

$$\frac{2}{(1 - x)^3} = 2 + 2(3x) + 3(4x^2) + \cdots + (n + 1)(n + 2)x^n + \cdots,$$

and so on, the formula for the rth derivative being

$$\frac{r!}{(1 - x)^{r+1}} = P(r, r) + P(r + 1, r)x + P(r + 2, r)x^2 + \cdots + P(r + n, r)x^n + \cdots$$

Dividing both sides of this equation by $r!$ yields

$$\frac{1}{(1 - x)^{r+1}} = \sum_{n \ge 0} C(r + n, r)x^n, \tag{4.27}$$

the generating function for $\{C(r + n, r)\}$. When both sides of Equation (4.27) are
multiplied by $x^r$, the result is

$$\frac{x^r}{(1 - x)^{r+1}} = \sum_{n \ge 0} C(r + n, r)x^{n+r} = \sum_{n \ge r} C(n, r)x^n = \sum_{n \ge 0} C(n, r)x^n \tag{4.28}$$

because $C(n, r) = 0$ whenever $n < r$. Let's summarize.

4.2.11 Theorem. Let $r$ be a fixed nonnegative integer. If $a_n = C(n, r)$, $n \ge 0$,
a closed formula for the generating function of $\{a_n\}$ is $g(x) = x^r/(1 - x)^{r+1}$.

The nth term of the sequence $\{C(n, r)\}$ is a value of the polynomial

$$f(x) = \frac{1}{r!}\,x(x - 1)\cdots(x - r + 1)$$

and a coefficient of the generating function $g(x) = x^r/(1 - x)^{r+1}$. In other words,
$f(n)$ is a closed formula (solution) for the nth term of the sequence, while
$x^r/(1 - x)^{r+1}$ is a closed formula (freeze-dried version) for the generating function
of the sequence.

4.2.12 Example. Speaking of values vs. coefficients, consider the sequence
$\{a_n\}$ given by $a_0 = 0$ and $a_{n+1} = a_n + 2n + 1$, $n \ge 0$, the first few terms of which
are

$$0, 1, 4, 9, 16, 25, \ldots$$

That's right, it's the familiar sequence of (perfect) squares. In particular, $f(n) = n^2$
solves the sequence in the sense that its nth term is given by $a_n = f(n)$. What about
the generating function

$$g(x) = \sum_{n \ge 0} a_n x^n?$$

As Yogi Berra once remarked, this looks like deja vu all over again: Because
$n^2 = C(n, 1) + 2C(n, 2)$,

$$\begin{aligned}
g(x) &= \sum_{n \ge 0} [C(n, 1) + 2C(n, 2)]x^n \\
&= \sum_{n \ge 0} C(n, 1)x^n + 2\sum_{n \ge 0} C(n, 2)x^n \\
&= \frac{x}{(1 - x)^2} + 2\,\frac{x^2}{(1 - x)^3} \\
&= \frac{x(1 + x)}{(1 - x)^3}
\end{aligned}$$

by Theorem 4.2.11. □
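
A closed formula such as $x(1 + x)/(1 - x)^3$ can be unpacked again by formal long division of power series. The sketch below is one way to do that in Python; `series_quotient` is a hypothetical helper written for this illustration, and it assumes the denominator has a nonzero constant term.

```python
from fractions import Fraction
from math import comb


def series_quotient(numerator, denominator, terms):
    """Leading `terms` coefficients of numerator/denominator as a formal
    power series. Inputs are coefficient lists, lowest degree first."""
    num = list(numerator) + [0] * terms
    coeffs = []
    for n in range(terms):
        # Coefficient of x^n in (denominator * quotient) must equal num[n].
        acc = Fraction(num[n])
        for k in range(1, min(n, len(denominator) - 1) + 1):
            acc -= denominator[k] * coeffs[n - k]
        coeffs.append(acc / denominator[0])
    return coeffs


# (1 - x)^3 = 1 - 3x + 3x^2 - x^3
cube = [1, -3, 3, -1]

# x(1 + x)/(1 - x)^3 should reproduce the squares (Example 4.2.12).
squares = series_quotient([0, 1, 1], cube, 10)
assert squares == [n * n for n in range(10)]

# x^2/(1 - x)^3 should reproduce C(n, 2) (Theorem 4.2.11 with r = 2).
assert series_quotient([0, 0, 1], cube, 10) == [comb(n, 2) for n in range(10)]
```

Exact rational arithmetic is used so that the same helper also works when the denominator's constant term is not 1.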

We conclude this section with a combinatorial proof of the Pythagorean theorem
due to E. R. Scheinerman.*

*A combinatorial proof of the Pythagorean theorem, Math. Mag. 68 (1995), 48-49.

4.2.13 Example. Let $a$, $b$, and $c$ be the lengths of the sides of a triangle. Then
the angle opposite side $c$ is a right angle if and only if $a^2 + b^2 = c^2$. This statement
of the Pythagorean theorem is equivalent to the identity $\sin^2(x) + \cos^2(x) = 1$,
$0 < x < \frac{1}{2}\pi$. (See Exercise 23.) Recall from calculus that the Maclaurin series
expansions for sine and cosine are

$$\sin(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots = \sum_{n \ge 0} s_n \frac{x^n}{n!},$$

$$\cos(x) = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots = \sum_{n \ge 0} c_n \frac{x^n}{n!},$$

where

$$s_n = \begin{cases} 0 & \text{if } n = 2k, \\ +1 & \text{if } n = 4k + 1, \\ -1 & \text{if } n = 4k - 1, \end{cases}
\qquad
c_n = \begin{cases} 0 & \text{if } n = 2k + 1, \\ +1 & \text{if } n = 4k, \\ -1 & \text{if } n = 4k + 2. \end{cases}$$



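It follows from Equations (4.19a)-(4.19b) that

$$\sin^2(x) = \sum_{n \ge 0} \left( \sum_{r=0}^{n} \frac{s_r}{r!}\,\frac{s_{n-r}}{(n - r)!} \right) x^n
= \sum_{n \ge 0} \left( \sum_{r=0}^{n} C(n, r)\,s_r s_{n-r} \right) \frac{x^n}{n!}.$$

Similarly,

$$\cos^2(x) = \sum_{n \ge 0} \left( \sum_{r=0}^{n} \frac{c_r}{r!}\,\frac{c_{n-r}}{(n - r)!} \right) x^n
= \sum_{n \ge 0} \left( \sum_{r=0}^{n} C(n, r)\,c_r c_{n-r} \right) \frac{x^n}{n!}.$$

It remains to prove that

$$\sum_{r=0}^{n} C(n, r)(s_r s_{n-r} + c_r c_{n-r}) = \delta_{n,0}.$$

When $n = 0$, the summation on the left-hand side is $s_0^2 + c_0^2 = 0 + 1 = 1$. If $n$ is odd,
then one of $r$ and $n - r$ is odd and the other is even, so $s_r s_{n-r} = c_r c_{n-r} = 0$, $0 \le r \le n$.
If $n$ is positive and even, then (see Exercise 24) the summation on the left-hand side
becomes

$$\pm \sum_{r=0}^{n} (-1)^r C(n, r) = 0$$

by Lemma 1.5.8. □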
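
The coefficient identity at the heart of Example 4.2.13 is easy to test by machine. In this Python sketch the functions `s` and `c` encode the sign patterns $s_n$ and $c_n$ defined above, and `pythagorean_sum` (a name chosen here for illustration) evaluates the left-hand side of the identity.

```python
from math import comb


def s(n):
    """Sine coefficient pattern: 0 for even n, +1 if n = 4k + 1, -1 if n = 4k - 1."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1


def c(n):
    """Cosine coefficient pattern: 0 for odd n, +1 if n = 4k, -1 if n = 4k + 2."""
    if n % 2 == 1:
        return 0
    return 1 if n % 4 == 0 else -1


def pythagorean_sum(n):
    """Sum over r of C(n, r) (s_r s_{n-r} + c_r c_{n-r})."""
    return sum(comb(n, r) * (s(r) * s(n - r) + c(r) * c(n - r))
               for r in range(n + 1))


# The sum should be 1 when n = 0 and 0 for every positive n.
assert pythagorean_sum(0) == 1
assert all(pythagorean_sum(n) == 0 for n in range(1, 41))
```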
4.2. EXERCISES

1 Find a closed formula for the generating function $g(x) = \sum_{n \ge 0} C(m, n)x^n =
C(m, 0) + C(m, 1)x + C(m, 2)x^2 + \cdots$, where $m$ is a fixed but arbitrary
positive integer.

2 Find a closed formula for the generating function $g(x) = \sum_{n \ge 0} a_n x^n$ when

(a) $a_n = 1$, $n \ge 0$.

(b) $a_0 = 0$ and $a_n = 1$, $n \ge 1$.

(c) $a_0 = a_1 = 0$ and $a_n = 1$, $n \ge 2$.

(d) $a_n = (-1)^n$, $n \ge 0$.

(e) $a_n = n + 1$, $n \ge 0$.

(f) $a_n = n$, $n \ge 0$.

(g) $a_n = (-1)^n n$, $n \ge 0$.

3 Find a closed formula for the generating function of the sequence

(a) $a_0 = 1$, $a_1 = 2$, and $a_n = 3a_{n-1} + 2a_{n-2}$, $n \ge 2$.

(b) $b_0 = 2$, $b_1 = 1$, and $b_n = 2b_{n-1} - 3b_{n-2}$, $n \ge 2$.

(c) $c_0 = 4$, $c_1 = 13$, and $c_n = 2c_{n-1} - c_{n-2}$, $n \ge 2$.

4 Use a closed formula for the generating function of $\{a_n\}$ to express $a_n$ as an
explicit function of $n$ when

(a) $a_0 = 7$, $a_1 = 6$, and $a_n = a_{n-1} + 6a_{n-2}$, $n \ge 2$.

(b) $b_0 = 0$, $b_1 = 1$, and $b_n = 2b_{n-1} + 15b_{n-2}$, $n \ge 2$.

(c) $c_0 = 3$, $c_1 = 6$, and $c_n = c_{n-1} + 20c_{n-2}$, $n \ge 2$.

5 Let $\{a_n\}$ be the sequence defined by $a_0 = a_1 = 3$, $a_2 = 29$, and $a_n = 3a_{n-1} +
10a_{n-2} - 24a_{n-3}$, $n \ge 3$. Use generating functions and partial fractions to
derive the solution $a_n = 4^n + (-3)^n + 2^n$.

6 The Fibonacci sequence is defined by $F_0 = F_1 = 1$, and $F_n = F_{n-1} + F_{n-2}$,
$n \ge 2$.

(a) Use generating functions and partial fractions to derive the identity



(b) Prove that $F_n = [C(n + 1, 1) + 5C(n + 1, 3) + 5^2 C(n + 1, 5) + \cdots]/2^n$.

(c) Prove that $F_n$ is the integer closest to



(d) According to some, the most visually pleasing shape for a rectangle is one
in which the ratio of adjacent sides is $\varphi = (1 + \sqrt{5})/2$. Compute, to two
decimal places, the ratio $F_{n+1}/F_n$, $1 \le n \le 9$. Compare the results with
the decimal expansion of $\varphi$.

(e) Prove that $\lim_{n \to \infty} F_{n+1}/F_n = (1 + \sqrt{5})/2$.

7 Suppose $\{a_n\}$ is a sequence for which the formal power series $a_0 + a_1 x +
a_2 x^2 + \cdots$ converges to a function $g(x)$ in some open interval $(-r, r)$. Show
that $a_n = g^{[n]}(0)/n!$, $n \ge 0$, where $g^{[0]} = g$ and $g^{[n]}$ is the nth derivative of $g$,
$n \ge 1$.

8 Find a formula for the sum $1 + 2 + 2^2 + \cdots + 2^{m-1}$ of the first $m$ numbers in
the geometric sequence $\{2^n\}$.

9 Prove that I + i + i + ^+.-.-i 

10 Recall that a k-part composition of $n$ is a positive integer solution to
$x_1 + x_2 + \cdots + x_k = n$. For fixed positive integers $k$ and $m$, denote by $a_n$ the
number of k-part compositions of $n$ none of whose parts is larger than $m$. Prove that
the generating function for $\{a_n\}$ is $g(x) = (x + x^2 + \cdots + x^m)^k$.

11 Let $k = 4$ and $m = 3$ in Exercise 10.

(a) Evaluate $g(x) = \sum_{n \ge 0} a_n x^n$, i.e., compute the coefficient $a_n$ of $x^n$ in
$(x + x^2 + x^3)^4$, $n \ge 0$.

(b) Confirm that there are exactly $a_7$ (the coefficient of $x^7$ from your answer to
part (a)) four-part compositions of 7, none of whose parts is larger than 3.

(c) Confirm that there are exactly $a_8$ four-part compositions of 8, none of
whose parts is larger than 3.

(d) Show that the number of four-part compositions of 9, none of whose parts is
larger than 3, is equal to the number of four-part compositions of 7, none
of whose parts is larger than 3.

(e) Confirm that there are exactly $a_9$ four-part compositions of 9, none of
whose parts is larger than 3.

12 Prove that $(x + x^2 + x^3 + \cdots)^k = \sum_{n \ge 1} C(n - 1, k - 1)x^n$

(a) using Equations (4.19a)-(4.19b) and induction on $k$.

(b) using the fact that the number of compositions of $n$ having $k$ parts is
$C(n - 1, k - 1)$.

13 Consider the sequence $\{b_n\}$ defined by $b_n = (-2)^n + (n + 1)3^n$. Show that
$b_0 = 2$, $b_1 = 4$, $b_2 = 31$, and $b_{n+1} = 4b_n + 3b_{n-1} - 18b_{n-2}$, $n \ge 2$.

14 Prove that the coefficient of $x^n$ in $(1 + x)^m/(1 - x)$ is $2^m$ for all $n \ge m$. (Hint:
Lemma 4.2.7.)

15 Let $g(x)$ be the generating function for $\{a_n\}$. Describe the sequence $\{b_n\}$
whose generating function is $(1 - x)g(x)$.
16 Let $g(x) = \sum_{n \ge 0} a_n x^n$ be the generating function for $\{a_n\}$. Solve for $a_n$ if

(a) $g(x) = 1/(1 - x)$.

(b) $g(x) = x(x + 1)/(1 - x)^3$.

(c) $g(x) = x(x^2 + 4x + 1)/(1 - x)^4$.

17 Given that $e^x = \sum_{n \ge 0} x^n/n!$, it must be that $e^{2x} = \sum_{n \ge 0} (2x)^n/n! =
\sum_{n \ge 0} 2^n x^n/n!$. On the other hand, $e^{2x} = (e^x)^2$. Use Equations (4.19a)-
(4.19b) to prove that $(\sum_{n \ge 0} x^n/n!)^2 = \sum_{n \ge 0} 2^n x^n/n!$.

18 If $g(x) = (1 - x)^{-1}$, then $g''(x) = 2(1 - x)^{-3} = 2g(x)^3$. Use Equations
(4.19a)-(4.19b) and formal term-by-term differentiation to confirm that
$g''(x) = 2g(x)^3$ when $g(x) = 1 + x + x^2 + x^3 + \cdots$.

19 For a fixed but arbitrary positive integer $m$, let $g_m(x) = \sum_{n \ge 1} S(m, n)x^n$. (Don't
confuse $g_m(x)$ with $x^m = \sum_{n \ge 1} S(m, n)x^{(n)}$.)

(a) Show that $e^x g_m(x)$ is the generating function for $\{n^m/n!\}$, i.e., that
$e^x g_m(x) = \sum_{n \ge 1} n^m x^n/n!$.

(b) Show that $g_m(x) = e^{-x} \sum_{n \ge 1} n^m x^n/n!$.

(c) Use (the right-hand side of) the equation in part (b) to compute $S(4, n)$,
$1 \le n \le 5$.

(d) Describe the relationship between part (b) and Stirling's identity.

(e) Give the generating function proof of Dobinski's formula for the Bell
numbers $B_m = e^{-1} \sum_{n \ge 1} n^m/n!$.

20 Find the reciprocal of $g(x) = \sum_{n \ge 0} a_n x^n$ if

(a) $a_n = 2(3^n) + (-2)^n$. (Hint: Equation (4.17).)

(b) $a_0 = 2$, $a_1 = 3$, and $a_{n+2} = 5a_{n+1} - 6a_n$, $n \ge 0$.

(c) $a_n$ is the binomial coefficient $C(n + 3, 3)$, so that $g(x) = 1 + 4x +
10x^2 + 20x^3 + 35x^4 + \cdots$. (Hint: Equation (4.27).)

(d) $a_n$ is the Stirling number of the second kind, $S(n + 3, 3)$, so that $g(x) =
1 + 6x + 25x^2 + 90x^3 + 301x^4 + \cdots$. (Hint: The proof of Theorem 4.2.6.)

21 Denote by $K(n)$ the number of ways to choose $n$ elements, with replacement,
from the set $A = \{r, s, t\}$, where order doesn't matter, but subject to the
conditions that $r$ can be chosen at most three times, $s$ at most twice, and $t$ at
most once. Let $g(x)$ be the generating function for $\{K(n)\}$, i.e., $g(x) =
\sum_{n \ge 0} K(n)x^n$. Show that $g(x) = (1 + x + x^2 + x^3)(1 + x + x^2)(1 + x)$.

22 Let C(n) be the number of ways to choose n elements from the set {N, D, Q}, 
where order doesn't matter, but subject to the conditions that N can be chosen 
at most ten times, D at most five times, and Q at most twice. 

(a) Find a closed formula for the generating function for {C(n)}. 

(b) In how many different ways can you change a half-dollar coin using only 
Nickels, Dimes, and Quarters? 
23 Prove the equivalence of the two statements of the Pythagorean theorem given
in Example 4.2.13.

24 Let $s_n$ and $c_n$ be the quantities related to sines and cosines defined in Example
4.2.13. Prove that

(a) $\sum_{r=0}^{n} C(n, r)(s_r s_{n-r} + c_r c_{n-r}) = \sum_{r=0}^{n} (-1)^r C(n, r)$ if $n = 4k$.

(b) $\sum_{r=0}^{n} C(n, r)(s_r s_{n-r} + c_r c_{n-r}) = -\sum_{r=0}^{n} (-1)^r C(n, r)$ if $n = 4k + 2$.

25 Find a closed formula for the generating function $g(x)$ of the sequence
(a) $\{3^n\}$. (b) $\{n^3\}$. (c) $\{2n^3 + 3n^2\}$.

26 Prove the partial fraction decomposition

$$\frac{n!}{x(x + 1)\cdots(x + n)} = \frac{C(n, 0)}{x} - \frac{C(n, 1)}{x + 1} + \frac{C(n, 2)}{x + 2} - \cdots \pm \frac{C(n, n)}{x + n}.$$

27 Consider the sequence $0, \frac{1}{2}, 1\frac{1}{3}, 2\frac{1}{4}, 3\frac{1}{5}, 4\frac{1}{6}, \ldots$, denoting its nth term by $a_n$
and its generating function by $f(x) = \sum_{n \ge 0} a_n x^n$.

(a) Assuming the pattern continues, find a closed formula for $a_n$.

(b) Without actually doing it, describe in words how Example 4.2.12 might
be used to obtain a closed formula for $f(x)$.



4.3. APPLICATIONS OF GENERATING FUNCTIONS 

If we make the substitution $m = r + 1$ in Equation (4.27) and replace $C(m - 1 + n,
m - 1)$ with $C(m + n - 1, n)$, the result is

$$\frac{1}{(1 - x)^m} = \sum_{n \ge 0} C(n + m - 1, n)x^n, \tag{4.29}$$

the generating function for the number of ways to choose $n$ times from
$\{1, 2, \ldots, m\}$, with replacement, if order doesn't matter. In fact, there is no need
to appeal to Equation (4.27). If $x^{n_1}, x^{n_2}, \ldots, x^{n_m}$ are chosen from the $m$ sets of
parentheses on the right-hand side of

$$(1 - x)^{-m} = (1 + x + x^2 + \cdots)(1 + x + x^2 + \cdots)\cdots(1 + x + x^2 + \cdots),$$

their product will be $x^n$ if and only if $n_1 + n_2 + \cdots + n_m = n$, i.e., the coefficient of
$x^n$ in $(1 - x)^{-m}$ is the number of nonnegative integer solutions to this equation, a
number that we know to be $C(n + m - 1, n)$.

Replacing $x$ with $-x$ in Equation (4.29) produces

$$(1 + x)^{-m} = \sum_{n \ge 0} (-1)^n C(n + m - 1, n)x^n, \tag{4.30a}$$

a binomial-type theorem for negative exponents. With the proper definition of
$C(-m, n)$, we can make it look even more like the binomial theorem.

4.3.1 Definition. Let $n$ be a nonnegative integer. If $u$ is any real number, the
extended binomial coefficient $C(u, 0) = 1$, and

$$C(u, n) = \binom{u}{n} = \frac{u(u - 1)\cdots(u - [n - 1])}{n!}, \quad n > 0.$$

4.3.2 Example. If $m$ is a positive integer, then, taking $u = -m$,

$$\begin{aligned}
C(-m, n) &= \frac{-m(-m - 1)\cdots(-m - [n - 1])}{n!} \\
&= \frac{(-1)^n m(m + 1)\cdots(m + [n - 1])}{n!} \\
&= (-1)^n C(m + n - 1, n). \qquad □
\end{aligned}$$

In view of Example 4.3.2, Equation (4.30a) can be written

$$(x + 1)^{-m} = \sum_{n \ge 0} C(-m, n)x^n. \tag{4.30b}$$

A hundred years before the American Revolution, Isaac Newton* extended this 
binomial-type theorem even further. 

4.3.3 Newton's Binomial Theorem. Let $u$ be a real number. If $|x| < |y|$, then

$$(x + y)^u = \sum_{n \ge 0} C(u, n)x^n y^{u-n}.$$

The curious hypothesis $|x| < |y|$ signals a change of perspective. Equations
(4.30a)-(4.30b) concern generating functions involving formal power series.
Theorem 4.3.3 is a statement about a function of two variables that involves substituting
numbers for $x$ and $y$.

*The first proof of Newton's binomial theorem was published in 1812 by Gauss.
Figure 4.3.1. Maclaurin coefficients for $(x + 1)^{1/2}$.

Proof of Theorem 4.3.3. When $u$ is a positive integer, the result is just the
binomial theorem, which holds without restrictions on $x$ and $y$. Otherwise, because

$$(x + y)^u = y^u \left( \frac{x}{y} + 1 \right)^u,$$

it suffices to prove the result when $y = 1$. If $f(x) = (x + 1)^u$, then (confirm it)
$f^{[n]}(0)/n! = C(u, n)$, resulting in the Maclaurin series expansion

$$(x + 1)^u = \sum_{n \ge 0} C(u, n)x^n. \tag{4.31}$$

Because

$$\lim_{n \to \infty} \frac{|C(u, n + 1)x^{n+1}|}{|C(u, n)x^n|} = \lim_{n \to \infty} \frac{|u - n|}{n + 1}\,|x| = |x|,$$

Equation (4.31) converges absolutely for all $|x| < 1$ by the ratio test. ■

4.3.4 Example. Suppose $f(x) = (x + 1)^{1/2}$. From computations summarized in
Fig. 4.3.1, the Maclaurin series expansion for $f(x)$ is

$$(x + 1)^{1/2} = 1 + \frac{1}{2}x - \frac{1}{8}x^2 + \frac{1}{16}x^3 - \frac{5}{128}x^4 + \cdots \tag{4.32}$$

Comparing coefficients of $x^n$ in Equations (4.31) and (4.32), it must be the case,
e.g., that $C(\frac{1}{2}, 4) = -\frac{5}{128}$. Let's check it out. From Definition 4.3.1,

$$\frac{1}{2}\left(\frac{-1}{2}\right)\left(\frac{-3}{2}\right)\left(\frac{-5}{2}\right)\frac{1}{4!} = \frac{-15/16}{24} = \frac{-5}{128}. \qquad □$$
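
Definition 4.3.1 translates directly into exact rational arithmetic. The following Python sketch (the name `extended_binomial` is an assumption made for this illustration) reproduces the coefficients of Equation (4.32) and the sign rule of Example 4.3.2.

```python
from fractions import Fraction
from math import comb


def extended_binomial(u, n):
    """C(u, n) = u(u - 1)...(u - [n - 1]) / n! for rational u and integer n >= 0."""
    u = Fraction(u)
    result = Fraction(1)          # C(u, 0) = 1 (empty product)
    for j in range(n):
        result *= (u - j) / (j + 1)
    return result


half = Fraction(1, 2)

# Maclaurin coefficients of (x + 1)^(1/2), as in Equation (4.32).
assert [extended_binomial(half, n) for n in range(5)] == [
    1, Fraction(1, 2), Fraction(-1, 8), Fraction(1, 16), Fraction(-5, 128)]

# Example 4.3.2: C(-m, n) = (-1)^n C(m + n - 1, n).
assert all(extended_binomial(-m, n) == (-1) ** n * comb(m + n - 1, n)
           for m in range(1, 6) for n in range(8))
```

Multiplying one factor at a time keeps every intermediate value an exact fraction, so no rounding enters the comparison.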
Shifting from binomial coefficients to Stirling numbers of the second kind, let
$f_r(x)$ be the generating function for $\{S(n, r)\}$, i.e.,

$$f_r(x) = \sum_{n \ge 0} S(n, r)x^n = S(r, r)x^r + S(r + 1, r)x^{r+1} + S(r + 2, r)x^{r+2} + \cdots$$

Then (from Fig. 2.1.2),

$$f_1(x) = x + x^2 + x^3 + x^4 + x^5 + x^6 + \cdots = \frac{x}{1 - x}, \tag{4.33}$$

$$f_2(x) = x^2 + 3x^3 + 7x^4 + 15x^5 + 31x^6 + \cdots \tag{4.34}$$

$$f_3(x) = x^3 + 6x^4 + 25x^5 + 90x^6 + 301x^7 + \cdots$$

and so on. If $r > 1$, then adding

$$-rx f_r(x) = -rS(r, r)x^{r+1} - rS(r + 1, r)x^{r+2} - \cdots$$

to $f_r(x)$ gives

$$\begin{aligned}
(1 - rx)f_r(x) &= S(r, r)x^r + [S(r + 1, r) - rS(r, r)]x^{r+1} + [S(r + 2, r) - rS(r + 1, r)]x^{r+2} + \cdots \\
&= S(r - 1, r - 1)x^r + S(r, r - 1)x^{r+1} + S(r + 1, r - 1)x^{r+2} + \cdots \\
&= x f_{r-1}(x)
\end{aligned}$$

by the recurrence $S(k + 1, r) = S(k, r - 1) + rS(k, r)$ (and the fact that $S(r, r) =
1 = S(r - 1, r - 1)$). So, for all $r > 1$, $f_r(x) = x f_{r-1}(x)/(1 - rx)$. Together with
Equation (4.33), this implies

$$f_2(x) = \frac{x^2}{(1 - 2x)(1 - x)}, \tag{4.35}$$

$$f_3(x) = \frac{x^3}{(1 - 3x)(1 - 2x)(1 - x)}, \tag{4.36}$$

and so on. Along with induction, these observations prove the following.

4.3.5 Theorem. Let $r$ be a fixed but arbitrary positive integer. Denote by $f_r(x)$
the generating function for the sequence $\{S(n, r)\}$ of Stirling numbers of the second
kind, i.e., $f_r(x) = \sum_{n \ge 0} S(n, r)x^n$. Then

$$f_r(x) = \frac{x^r}{(1 - x)(1 - 2x)\cdots(1 - rx)}. \tag{4.37}$$
4.3.6 Example. From Equation (4.35),

$$\sum_{n \ge 2} S(n, 2)x^n = \frac{x^2}{(1 - 2x)(1 - x)}. \tag{4.38}$$

Using partial fractions,

$$\begin{aligned}
\frac{1}{(1 - 2x)(1 - x)} &= \frac{2}{1 - 2x} - \frac{1}{1 - x} \\
&= 2[1 + 2x + (2x)^2 + (2x)^3 + \cdots] - [1 + x + x^2 + x^3 + \cdots] \\
&= \sum_{n \ge 0} (2^{n+1} - 1)x^n \qquad (4.39) \\
&= 1 + 3x + 7x^2 + 15x^3 + 31x^4 + \cdots
\end{aligned}$$

Multiplying by $x^2$, we recover Equation (4.34):

$$f_2(x) = x^2 + 3x^3 + 7x^4 + 15x^5 + 31x^6 + \cdots$$

So far, so good. Now let's see what was overlooked in the rush to compute:
Multiplying Equation (4.39) by $x^2$ yields

$$\sum_{n \ge 2} S(n, 2)x^n = f_2(x) = \sum_{n \ge 0} (2^{n+1} - 1)x^{n+2} = \sum_{n \ge 2} (2^{n-1} - 1)x^n.$$

Comparing the coefficient of $x^n$ on either side of this equation yields the closed
formula $S(n, 2) = 2^{n-1} - 1$.

Similarly, from Equation (4.36) (check the computations),

$$\begin{aligned}
\sum_{n \ge 3} S(n, 3)x^n &= x^3 \left[ \frac{9/2}{1 - 3x} - \frac{4}{1 - 2x} + \frac{1/2}{1 - x} \right] \\
&= x^3 \left[ \tfrac{9}{2}(1 + 3x + (3x)^2 + \cdots) - 4(1 + 2x + (2x)^2 + \cdots) + \tfrac{1}{2}(1 + x + x^2 + \cdots) \right].
\end{aligned}$$

Therefore, $S(n, 3) = \frac{1}{2}(3^{n-1} - 2^n + 1)$.

What is the generalization? If partial fractions are used with Equation (4.37), the
result is a generating function proof of Stirling's identity,

$$S(n, r) = \frac{1}{r!} \sum_{t=1}^{r} (-1)^{r-t} C(r, t)\,t^n.$$
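
The closed formulas of Example 4.3.6 can be compared with values generated by the recurrence $S(k + 1, r) = S(k, r - 1) + rS(k, r)$. The Python sketch below also evaluates Stirling's identity in its standard form $S(n, r) = \frac{1}{r!}\sum_{t=1}^{r} (-1)^{r-t} C(r, t)\,t^n$; the function names are illustrative.

```python
from math import comb, factorial


def stirling2_table(max_n):
    """Table S[n][r] of Stirling numbers of the second kind, 0 <= r <= n <= max_n,
    built from S(k + 1, r) = S(k, r - 1) + r S(k, r) with S(0, 0) = 1."""
    S = [[0] * (max_n + 1) for _ in range(max_n + 1)]
    S[0][0] = 1
    for n in range(1, max_n + 1):
        for r in range(1, n + 1):
            S[n][r] = S[n - 1][r - 1] + r * S[n - 1][r]
    return S


def stirling_identity(n, r):
    """Stirling's identity for n, r >= 1; the alternating sum equals r! S(n, r)."""
    total = sum((-1) ** (r - t) * comb(r, t) * t ** n for t in range(1, r + 1))
    return total // factorial(r)


S = stirling2_table(10)
assert all(S[n][2] == 2 ** (n - 1) - 1 for n in range(2, 11))
assert all(2 * S[n][3] == 3 ** (n - 1) - 2 ** n + 1 for n in range(3, 11))
assert all(stirling_identity(n, r) == S[n][r]
           for n in range(1, 11) for r in range(1, n + 1))
```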

With the convention $p(0) = 1$, the partition generating function is

$$P(x) = \sum_{n \ge 0} p(n)x^n = 1 + x + 2x^2 + 3x^3 + 5x^4 + 7x^5 + 11x^6 + 15x^7 + 22x^8 + \cdots$$

There is no closed formula for $P(x)$, but there is an interesting formula.

4.3.7 Theorem. The partition generating function

$$P(x) = \prod_{k \ge 1} \frac{1}{1 - x^k}. \tag{4.40}$$

Whoa! An infinite product?

4.3.8 Example. The coefficient of $x^4$ in the infinite product

$$\prod_{k \ge 1} (1 + x^k + x^{2k} + \cdots) = (1 + x + x^2 + \cdots)(1 + [x^2] + [x^2]^2 + \cdots)(1 + [x^3] + [x^3]^2 + \cdots)(1 + [x^4] + [x^4]^2 + \cdots)\cdots$$

is the same as the coefficient of $x^4$ in the finite product

$$(1 + x + x^2 + x^3 + x^4)(1 + [x^2] + [x^2]^2)(1 + [x^3])(1 + [x^4]). \qquad □$$

Proof of Theorem 4.3.7. Recall the shorthand notation for partitions, e.g.,

$$[4^3, 3^5, 2^4, 1^6] = [4,4,4,3,3,3,3,3,2,2,2,2,1,1,1,1,1,1],$$

where exponents denote the multiplicities of repeated parts. Thus, e.g., $4^3$ contributes
not $4 \times 4 \times 4 = 64$, but $4 + 4 + 4 = 12$ to the sum $4 \times 3 + 3 \times 5 +
2 \times 4 + 1 \times 6 = 41$. In particular, $[4^3, 3^5, 2^4, 1^6] \vdash 41$. More generally,
$[\ldots, 4^{r_4}, 3^{r_3}, 2^{r_2}, 1^{r_1}] \vdash n$ if and only if

$$\cdots + 4r_4 + 3r_3 + 2r_2 + 1r_1 = n. \tag{4.41}$$

The distributivity of multiplication over addition implies that a product of finitely
many finite sums can be evaluated by choosing one term from each summand (set
of parentheses), multiplying the choices together, adding the resulting products for
all possible ways of making the selections, and "combining like terms". With the
added constraint that 1 must be the choice from all but finitely many summands,
this process extends to evaluating the infinite product

$$(1 + x + x^2 + x^3 + \cdots)(1 + [x^2] + [x^2]^2 + [x^2]^3 + \cdots)(1 + [x^3] + [x^3]^2 + [x^3]^3 + \cdots)\cdots$$

    Partition    Solution              Selection
    [4]          r_4 = 1               1 × 1 × 1 × [x^4] × 1 × ...
    [3,1]        r_1 = r_3 = 1         x × 1 × [x^3] × 1 × 1 × ...
    [2^2]        r_2 = 2               1 × [x^2]^2 × 1 × 1 × 1 × ...
    [2,1^2]      r_1 = 2, r_2 = 1      x^2 × x^2 × 1 × 1 × 1 × ...
    [1^4]        r_1 = 4               x^4 × 1 × 1 × 1 × 1 × ...

Figure 4.3.2

There is a natural one-to-one correspondence between the different choices that
produce $x^n$ in this process, and the distinct partitions of $n$. If

$x^{r_1}$ is chosen from $(1 + x + x^2 + x^3 + \cdots)$,
$[x^2]^{r_2} = x^{2r_2}$ from $(1 + [x^2] + [x^2]^2 + [x^2]^3 + \cdots)$,
$[x^3]^{r_3} = x^{3r_3}$ from $(1 + [x^3] + [x^3]^2 + [x^3]^3 + \cdots)$,

and so on, then the product $x^{r_1} \times x^{2r_2} \times \cdots \times x^{nr_n} = x^n$, if and only if
$r_1 + 2r_2 + \cdots + nr_n = n$. By Equation (4.41), this is equivalent to
$[n^{r_n}, \ldots, 2^{r_2}, 1^{r_1}] \vdash n$. So, the
coefficient of $x^n$ on the right-hand side of Equation (4.40) is exactly $p(n)$. ■

In the proof of Theorem 4.3.7, the correspondence between partitions of $n = 4$,
solutions of $r_1 + 2r_2 + 3r_3 + 4r_4 = 4$, and selections yielding $x^4$ is tabulated in
Fig. 4.3.2.
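
The selection process in the proof of Theorem 4.3.7 can be imitated by multiplying truncated power series. In this Python sketch (the name `partition_counts` is chosen for illustration), every factor $1/(1 - x^k)$ with $k$ up to the truncation degree is multiplied in; factors with larger $k$ cannot affect the retained coefficients.

```python
def partition_counts(limit):
    """Coefficients p(0), ..., p(limit) of the product of 1/(1 - x^k), k >= 1,
    truncated after degree `limit`."""
    coeffs = [1] + [0] * limit            # the constant series 1
    for k in range(1, limit + 1):
        # Multiply by 1 + x^k + x^{2k} + ...: the new coefficient of x^n is the
        # old one plus the (already updated) coefficient of x^{n-k}.
        for n in range(k, limit + 1):
            coeffs[n] += coeffs[n - k]
    return coeffs


# First few partition numbers, as listed in the expansion of P(x).
assert partition_counts(8) == [1, 1, 2, 3, 5, 7, 11, 15, 22]
```

Stopping the outer loop at a smaller $k$ gives the number of partitions with no part larger than $k$, which is the finite product considered in Example 4.3.8.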

4.3.9 Example. In how many ways can change be made for a dollar? Not
counting a dollar coin as "change", the available coins are pennies, nickels, dimes,
quarters, and half dollars.

The answer is the number of nonnegative integer solutions to the equation

$$p + 5n + 10d + 25q + 50h = 100.$$

This is a partition problem in which the parts are restricted to the values 1, 5, 10, 25,
and 50.

With no restrictions on the denominations of the coins, the answer would be the
coefficient of $x^{100}$ in the infinite product

$$\prod_{k \ge 1} (1 - x^k)^{-1} = (1 + x + x^2 + \cdots) \times (1 + [x^2] + [x^2]^2 + \cdots) \times (1 + [x^3] + [x^3]^2 + \cdots) \times \cdots$$

With the restrictions imposed by U.S. coins, the answer involves just those
contributions in which 1 is the mandatory choice from all summands but the 1st, 5th,
10th, 25th, and 50th, i.e., the number of ways to change a dollar is the coefficient
of $x^{100}$ in the product

$$(1 + x + x^2 + \cdots)(1 + [x^5] + [x^5]^2 + \cdots)(1 + [x^{10}] + [x^{10}]^2 + \cdots) \times (1 + [x^{25}] + [x^{25}]^2 + \cdots)(1 + [x^{50}] + [x^{50}]^2 + \cdots).$$

Thus, e.g., the contribution

$$1 \times [x^5]^4 \times [x^{10}]^3 \times 1 \times [x^{50}] = x^{100}$$

corresponds to changing the dollar with four nickels, three dimes, and a half
dollar; making change with four quarters corresponds to $1 \times 1 \times 1 \times [x^{25}]^4 \times 1$,
and so on. □
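
The coefficient of $x^{100}$ in the five-factor product of Example 4.3.9 can be extracted with the same truncated multiplication, using only the factors for parts 1, 5, 10, 25, and 50. The following Python sketch (with the illustrative name `count_change`) does this; the resulting count for a dollar is the value stated in Exercise 25 of this section.

```python
def count_change(total, denominations):
    """Coefficient of x^total in the product of 1/(1 - x^d) over the given
    denominations, i.e., the number of partitions of `total` into those parts."""
    ways = [1] + [0] * total
    for coin in denominations:
        # Multiply the running series by 1 + x^coin + x^(2 coin) + ...
        for amount in range(coin, total + 1):
            ways[amount] += ways[amount - coin]
    return ways[total]


# Pennies, nickels, dimes, quarters, and half dollars.
assert count_change(100, [1, 5, 10, 25, 50]) == 292
```

Because each denomination is processed once, in a fixed order, two selections that differ only in the order of the coins are not counted separately.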

There are other interesting ways to restrict the parts of partitions. 

4.3.10 Example. The $p(6) = 11$ partitions of 6 are $[6]$, $[5,1]$, $[4,2]$, $[4,1^2]$, $[3^2]$,
$[3,2,1]$, $[3,1^3]$, $[2^3]$, $[2^2,1^2]$, $[2,1^4]$, and $[1^6]$. Some of these expressions are
complicated by exponents (indicating multiplicities). The simpler ones, $[6]$, $[5,1]$, $[4,2]$,
and $[3,2,1]$, are those having distinct parts. Denote by $p_{dist}(n)$ the number of
partitions of $n$, each of whose parts is different. Then, e.g., $p_{dist}(6) = 4$. The generating
function for $\{p_{dist}(n)\}$ is

$$h(x) = \sum_{n \ge 0} p_{dist}(n)x^n = (1 + x)(1 + x^2)(1 + x^3)(1 + x^4)\cdots, \tag{4.42}$$

where, by convention, $p_{dist}(0) = 1$. □
4.3.11 Example. Let $p_{odd}(n)$ be the number of partitions of $n$ each of whose
parts is odd. From Example 4.3.10, the odd-part partitions of 6 are
$[5,1]$, $[3^2]$, $[3,1^3]$, and $[1^6]$, so $p_{odd}(6) = 4$. The generating function for $\{p_{odd}(n)\}$ is

$$g(x) = \sum_{n \ge 0} p_{odd}(n)x^n = \left( \frac{1}{1 - x} \right) \left( \frac{1}{1 - x^3} \right) \left( \frac{1}{1 - x^5} \right) \cdots, \tag{4.43}$$

where, by convention, $p_{odd}(0) = 1$. □

From Examples 4.3.10 and 4.3.11, $p_{odd}(6) = p_{dist}(6)$. This coincidence turns out
not to be an accident.

4.3.12 Theorem. Let $p_{odd}(n)$ be the number of partitions of $n$ each of whose
parts is odd and $p_{dist}(n)$ the number having distinct parts. Then, for every positive
integer $n$, $p_{odd}(n) = p_{dist}(n)$.

Proof. From Example 4.3.10, the generating function for $p_{dist}(n)$ is

$$\begin{aligned}
h(x) &= (1 + x)(1 + x^2)(1 + x^3)(1 + x^4)\cdots \\
&= \left( \frac{1 - x^2}{1 - x} \right) \left( \frac{1 - [x^2]^2}{1 - x^2} \right) \left( \frac{1 - [x^3]^2}{1 - x^3} \right) \left( \frac{1 - [x^4]^2}{1 - x^4} \right) \cdots
\end{aligned}$$

After canceling $1 - x^2$, $1 - x^4$, and so on, i.e., every term from the numerator and
every second term from the denominator, we are left with $g(x)$ from Example
4.3.11 on the right-hand side. ■
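
Theorem 4.3.12 can be spot-checked by expanding both generating functions to the same degree. In the Python sketch below the two function names are illustrative; the first multiplies in the factors $1/(1 - x^k)$ for odd $k$, the second the factors $(1 + x^k)$.

```python
def odd_part_counts(limit):
    """Coefficients of the product of 1/(1 - x^k) over odd k, up to degree `limit`."""
    coeffs = [1] + [0] * limit
    for k in range(1, limit + 1, 2):
        for n in range(k, limit + 1):          # ascending: part k may repeat
            coeffs[n] += coeffs[n - k]
    return coeffs


def distinct_part_counts(limit):
    """Coefficients of the product of (1 + x^k), k >= 1, up to degree `limit`."""
    coeffs = [1] + [0] * limit
    for k in range(1, limit + 1):
        for n in range(limit, k - 1, -1):      # descending: part k used at most once
            coeffs[n] += coeffs[n - k]
    return coeffs


assert odd_part_counts(60) == distinct_part_counts(60)
assert odd_part_counts(6)[6] == 4              # p_odd(6) = p_dist(6) = 4
```

The direction of the inner loop is the only difference between allowing a part to repeat and allowing it at most once.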

Let's return to the partition generating function

$$P(x) = \sum_{n \ge 0} p(n)x^n = 1 + x + 2x^2 + 3x^3 + 5x^4 + 7x^5 + 11x^6 + 15x^7 + 22x^8 + \cdots = \prod_{k \ge 1} \frac{1}{1 - x^k}. \tag{4.44}$$

Because the constant coefficient $p(0) = 1 \ne 0$ (by convention), the formal power
series $P(x)$ has a reciprocal, call it $f(x) = \sum_{n \ge 0} a_n x^n$. Then, as in the proof of
Theorem 4.2.6, $a_0 = 1/p(0) = 1$. Because

$$0 = p(0)a_1 + p(1)a_0 = 1 \times a_1 + 1 \times 1, \tag{4.45a}$$

$a_1 = -1$; since

$$0 = p(0)a_2 + p(1)a_1 + p(2)a_0 = 1 \times a_2 + 1 \times (-1) + 2 \times 1, \tag{4.45b}$$

$a_2 = -1$; because

$$0 = p(0)a_3 + p(1)a_2 + p(2)a_1 + p(3)a_0 = 1 \times a_3 + 1 \times (-1) + 2 \times (-1) + 3 \times 1, \tag{4.45c}$$

$a_3 = 0$;

and so on. But, this is the hard way to proceed. The easy way is to invert both sides
of Equation (4.44), obtaining

$$\begin{aligned}
f(x) &= \prod_{k \ge 1} (1 - x^k) \\
&= (1 - x)(1 - x^2)(1 - x^3)(1 - x^4)(1 - x^5)\cdots \\
&= 1 - x - x^2 + x^5 + x^7 - x^{12} - x^{15} + x^{22} + x^{26} - x^{35} - \cdots
\end{aligned} \tag{4.46}$$

Judging from the first few terms, it appears that many coefficients of $f(x)$ are
zero and those that are not all seem to be $\pm 1$. After the first term, the signs seem to
alternate in pairs. Within these pairs, the exponents appear to drift further apart,
one unit at a time. Finally, the first exponent in each pair comes from the sequence

$$1, 5, 12, 22, 35, \ldots$$

Applying the techniques of Section 4.1 to this fragment suggests that its nth term is
given by the polynomial function $C(n, 0) + 4C(n, 1) + 3C(n, 2) = \frac{1}{2}(2 + 5n +
3n^2)$. (Confirm it!) If this formula is valid for all $n$, then the sequence is well
known! It consists of the so-called pentagonal numbers (Fig. 4.3.3).

Figure 4.3.3. Pentagonal Numbers (1, 5, 12, 22).

Historically, the pentagonal number sequence is written so as to begin, not with a
zeroth, but with a first term. This perspective can be accommodated by setting
$m = n + 1$. Starting with $m = 1$, the mth term of the pentagonal number sequence
is

$$\tfrac{1}{2}(2 + 5[m - 1] + 3[m - 1]^2) = \tfrac{1}{2}m(3m - 1).$$

4.3.13 Euler's Pentagonal Number Theorem. The reciprocal of the partition
generating function $P(x)$ is

$$f(x) = \sum_{n \ge 0} a_n x^n = \prod_{k \ge 1} (1 - x^k) = 1 + \sum_{m \ge 1} (-1)^m \left[ x^{m(3m-1)/2} + x^{m(3m+1)/2} \right]. \tag{4.47}$$



(Confirm that the first few terms of Equation (4.47) are precisely those given by 
Equation (4.46).) 
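
The comparison between the infinite product and the pentagonal exponents can also be carried out by machine. This Python sketch expands $\prod_{k \ge 1}(1 - x^k)$ to a chosen degree and compares it with the series whose only nonzero coefficients are $(-1)^m$ at the exponents $m(3m - 1)/2$ and $m(3m + 1)/2$, the standard form of Euler's theorem. The function names are illustrative.

```python
def euler_product_coefficients(limit):
    """Coefficients of the product of (1 - x^k), k >= 1, up to degree `limit`."""
    coeffs = [1] + [0] * limit
    for k in range(1, limit + 1):
        for n in range(limit, k - 1, -1):      # descending: multiply by (1 - x^k)
            coeffs[n] -= coeffs[n - k]
    return coeffs


def pentagonal_series_coefficients(limit):
    """Coefficients of 1 + sum over m >= 1 of (-1)^m [x^{m(3m-1)/2} + x^{m(3m+1)/2}]."""
    coeffs = [1] + [0] * limit
    m = 1
    while m * (3 * m - 1) // 2 <= limit:
        sign = -1 if m % 2 else 1
        for exponent in (m * (3 * m - 1) // 2, m * (3 * m + 1) // 2):
            if exponent <= limit:
                coeffs[exponent] += sign
        m += 1
    return coeffs


assert euler_product_coefficients(100) == pentagonal_series_coefficients(100)
```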

4.3.14 Example. Apart from historical footnotes, what good is Equation (4.47)?
For one thing, an independent way to compute the coefficients of $1/P(x) =
f(x) = \sum_{n \ge 0} a_n x^n$ gives us another way to look at $P(x) = 1/f(x)$. Let's reconsider
the approach illustrated by Equations (4.45a)-(4.45c), but this time from "the
reverse-angle". The coefficient, e.g., of $x^9$ in the product $f(x)P(x)$ is

$$0 = a_0 p(9) + a_1 p(8) + a_2 p(7) + \cdots + a_8 p(1) + a_9 p(0).$$

Substituting $a_0 = a_5 = a_7 = 1$, $a_1 = a_2 = -1$, and $a_3 = a_4 = a_6 = a_8 = a_9 = 0$ from
Equation (4.46) [an explicit representation of Equation (4.47)], yields

$$0 = p(9) - p(8) - p(7) + p(4) + p(2).$$

Upon substituting the values $p(2) = 2$, $p(4) = 5$, $p(7) = 15$, and $p(8) = 22$ from
Equation (4.44), this yields $p(9) = 30$. Similarly,

$$0 = p(10) - p(9) - p(8) + p(5) + p(3),$$

from which it follows that

$$p(10) = 30 + 22 - 7 - 3 = 42.$$

(Confirm these values by summing rows 9 and 10 of Figure 1.8.2.) □
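
The "reverse-angle" computation of Example 4.3.14 amounts to a recurrence for $p(n)$ in which only the pentagonal exponents contribute. A minimal Python sketch follows, assuming the coefficient pattern of Equation (4.46) continues as Euler's theorem asserts; the name `partition_numbers` is illustrative.

```python
def partition_numbers(limit):
    """p(0), ..., p(limit) from
    p(n) = sum over m >= 1 of (-1)^(m+1) [p(n - m(3m-1)/2) + p(n - m(3m+1)/2)],
    where terms with a negative argument are omitted."""
    p = [1] + [0] * limit
    for n in range(1, limit + 1):
        total = 0
        m = 1
        while m * (3 * m - 1) // 2 <= n:
            sign = 1 if m % 2 else -1
            total += sign * p[n - m * (3 * m - 1) // 2]
            second = m * (3 * m + 1) // 2
            if second <= n:
                total += sign * p[n - second]
            m += 1
        p[n] = total
    return p


p = partition_numbers(10)
assert p[9] == 30 and p[10] == 42            # the values found in Example 4.3.14
```

Each $p(n)$ needs only about $\sqrt{n}$ earlier values, which is why this recurrence is far cheaper than multiplying out the product for large $n$.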
4.3. EXERCISES

1 Evaluate the extended binomial coefficient 

(a) (-/). (b) (-/). (c) C(|,2). (d) C(-|,2). 

2 Show that $2n \times C(\frac{1}{2}, n) = C(-\frac{1}{2}, n - 1)$.

3 Show that $C(-u, n) = (-1)^n C(u + n - 1, n)$ for any real number $u$ and any
nonnegative integer $n$.

4 Prove that $(-1)^m C(-n, m - 1) = (-1)^n C(-m, n - 1)$.

5 Prove that $(-4)^n C(-\frac{1}{2}, n) = C(2n, n)$.

6 Prove Pascal's relation $C(u + 1, n) = C(u, n - 1) + C(u, n)$ for the extended
binomial coefficients.

7 Confirm that the formulas $S(m, 2) = 2^{m-1} - 1$ and $S(m, 3) = \frac{1}{2}(3^{m-1} -
2^m + 1)$, obtained in Example 4.3.6, are the $r = 2$ and $r = 3$ cases,
respectively, of Stirling's identity.

8 Consider $f_4(x) = x^4/[(1 - x)(1 - 2x)(1 - 3x)(1 - 4x)]$ from Equation (4.37).

(a) Expand $f_4(x)$ using partial fractions.

(b) Use your answer to part (a) to show that $S(m, 4) = \frac{1}{6}[4^{m-1} - 3^m +
3(2^{m-1}) - 1]$.

(c) Use part (b) to compute $S(8, 4)$.

(d) Show that part (b) is the $r = 4$ case of Stirling's identity.

9 Prove that the generating function for the Fibonacci numbers

$$F(x) = 1 + x + 2x^2 + 3x^3 + 5x^4 + 8x^5 + 13x^6 + \cdots$$

has radius of convergence $1/\varphi$, where $\varphi = (1 + \sqrt{5})/2$.

10 In the manner of Example 4.3.4, show that the first few terms in the Maclaurin
series expansion for $f(x) = (1 - x)^{-1/2}$ are $1 + \frac{1}{2}x + \frac{3}{8}x^2 + \frac{5}{16}x^3 + \frac{35}{128}x^4 + \cdots$

11 For things to work out properly in Exercise 10, $C(-\frac{1}{2}, 4)$ had better be $\frac{35}{128}$. Use
Definition 4.3.1 to confirm that it is.

12 By Newton's binomial theorem,

$$(1 + x)^{1/2} = 1 + C(\tfrac{1}{2}, 1)x + C(\tfrac{1}{2}, 2)x^2 + C(\tfrac{1}{2}, 3)x^3 + \cdots$$

Since the square of the left-hand side of this equation is $1 + x$, the square of
the right-hand side must be $1 + x$. In particular, the coefficient of $x^n$ in the
square of the right-hand side must be zero for all $n \ge 2$. From Equations
(4.19a)-(4.19b), the coefficient, e.g., of $x^2$ is $2C(\frac{1}{2}, 2) + C(\frac{1}{2}, 1)^2$.

(a) Use Definition 4.3.1 to confirm that $2C(\frac{1}{2}, 2) + C(\frac{1}{2}, 1)^2 = 0$.

(b) Use Equations (4.19a)-(4.19b) to express the coefficient of $x^3$ in the square
of the right-hand side; then use Definition 4.3.1 to confirm that it is zero.

(c) Use Equations (4.19a)-(4.19b) to express the coefficient of $x^4$ in the square of
the right-hand side; then use Definition 4.3.1 to confirm that it is zero.

(d) Further confirm parts (a)-(c) by truncating the right-hand side of Equation
(4.32) at the ellipsis ("$\cdots$") and squaring what's left.

13 Show that $(1 - 4x)^{-1/2} = 1 + 2x + 6x^2 + 20x^3 + 70x^4 + \cdots$.

14 Show that $(1 - 4x)^{-1/2}$ is the generating function for $C(2n, n)$ by

(a) using Newton's binomial theorem and Exercise 5.

(b) showing that $a_0 = 1$ and $a_{n+1} = (4n + 2)a_n/(n + 1)$, $n \ge 0$, in the
Maclaurin series expansion $(1 - 4x)^{-1/2} = \sum_{n \ge 0} a_n x^n$.

15 Let $g(x)$ be the generating function for the Catalan sequence $\{C(2n, n)/
(n + 1)\}$. Show that $g(x) = (1 - \sqrt{1 - 4x})/2x$.

16 Suppose $A$ is a nonempty subset of positive integers. Let $p_A(n)$ be the number
of partitions of $n$ each of whose parts is an element of $A$. Find a closed formula for
$g(x) = \sum_{n \ge 0} p_A(n)x^n$, where $p_A(0)$ is assumed to be 1.

17 Find a closed formula for $g(x) = \sum_{n \ge 0} a_n x^n$ when $a_0 = 1$ and $a_n$ is the number
of partitions of $n$,

(a) no part of which is repeated more than twice.

(b) no part of which is repeated more than three times.

18 For a fixed but arbitrary positive integer $k$, let $p(n; k)$ be the number of
partitions of $n$ each of whose parts is at most $k$, $n \ge 0$.

(a) Show that $p(n; k) = p(n; k - 1) + p(n - k; k)$.

(b) If $g_k(x) = \sum_{n \ge 0} p(n; k)x^n$ is the generating function for $\{p(n; k)\}$, show
that

$$g_k(x) = \prod_{i=1}^{k} (1 - x^i)^{-1}.$$

(c) Show that $p_k(n) = p(n; k) - p(n; k - 1)$, where $p_k(n)$ is the number of
k-part partitions of $n$.

(d) Let $f_k(x) = \sum_{n \ge 0} p_k(n)x^n$ be the generating function for $\{p_k(n)\}$. Use
parts (b) and (c) to show that

$$f_k(x) = x^k \prod_{i=1}^{k} (1 - x^i)^{-1}.$$

(e) Find a closed formula for the generating function for the partitions of $n$
each of whose parts is different and at most $k$.

19 Let $q_m(n)$ be the number of partitions of $n$ having $m$ parts each of which is
different (so that $\sum_{m \ge 1} q_m(n) = p_{dist}(n)$).

(a) Show that $q_3(10) = 4$.

(b) Show that $q_3(12) = 7$.

(c) Show that $q_m(n) = q_m(n - m) + q_{m-1}(n - m)$.

(d) Show that $1 + \sum_{n \ge m \ge 1} q_m(n)x^n t^m = \prod_{i \ge 1} (1 + tx^i)$.

(e) Prove that $q_m(n) = p_m(n - C(m, 2))$, the number of m-part partitions of
$n - C(m, 2)$ (with no restrictions on the parts).

20 Recall that $p_m(n)$ is the number of m-part partitions of $n$. Let
$f_m(x) = \sum_{n \ge 0} p_m(n)x^n$ be the generating function for $\{p_m(n)\}$.

(a) Show that $f_m(x)$ is the coefficient of $t^m$ in

$$P(x, t) = \prod_{i \ge 1} \frac{1}{1 - tx^i}.$$

(b) Compute the mth partial derivative of $P(x, t)$ with respect to $t$ and use it to
show that

$$f_m(x) = \frac{x^m}{(1 - x)(1 - x^2)\cdots(1 - x^m)}.$$

21 Prove that ^) m >i EJLi r ) xmfW = " 

22 In the manner of Example 4.3.14 (and using Equation (4.46)),

(a) show that $p(11) = 56$.

(b) show that $p(12) = p(0) - p(5) - p(7) + p(10) + p(11)$.

(c) evaluate $p(12)$.

23 Prove that 5(m + 1, n + 1) = + 1 )'" _r - 

24 Use Exercise 23 and Fig. 2.1.2 to show that
(a) $S(8, 5) = 1050$. (b) $S(9, 6) = 2646$.

25 Show that there are 292 ways to change a dollar. 

26 How many ways are there to change a 
(a) quarter? (b) half-dollar? 

27 Let b n be the number of nonnegative integer solutions to x\ + x-i + 
x 3 + x 4 = n. Find a closed formula for the generating function of the sequence 
{bn} if 

(a) Xi < 10, 1 < i < 4. 
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(b) xt is odd, 1 < i < 4. 

(c) 2 < x\ < 5, 7 < x 2 < 9, 4 < x 3 , and x 4 < 6. 

(d) there are no additional restrictions on the x,. 

28 Confirm Theorem 4.3.12 by showing that

(a) $(1 + x + x^2 + \cdots)(1 + x^3 + x^6 + \cdots)(1 + x^5 + \cdots)(1 + x^7 + \cdots)\cdots = 1 + x + x^2 + 2x^3 + 2x^4 + 3x^5 + 4x^6 + 5x^7 + \cdots$.

(b) $(1 + x)(1 + x^2)(1 + x^3)(1 + x^4)(1 + x^5)(1 + x^6)(1 + x^7)\cdots = 1 + x + x^2 + 2x^3 + 2x^4 + 3x^5 + 4x^6 + 5x^7 + \cdots$.

29 Let $a_n$ be the number of ways to distribute n unlabeled balls into eight labeled
urns. Find a closed formula for the generating function of the sequence $\{a_n\}$ if

(a) no urn is left empty. 

(b) no urn is left with fewer than three balls. 

30 Show that

(a) in the partial fraction expression

$$\frac{1}{(1 - x)(1 - 2x)\cdots(1 - rx)} = \frac{a_1}{1 - x} + \frac{a_2}{1 - 2x} + \cdots + \frac{a_r}{1 - rx},$$

$a_t = (-1)^{r-t}\,t^{r-1}/[(t - 1)!(r - t)!]$, $1 \le t \le r$.

(b) the Stirling number S(n, r) is the coefficient of $x^n$ in the expression

$$f_r(x) = x^r \sum_{t=1}^{r} (-1)^{r-t} \frac{t^{r-1}}{(t - 1)!(r - t)!}\,[1 + tx + (tx)^2 + (tx)^3 + \cdots].$$

(c) as advertised in Example 4.3.6, Equation (4.37) leads to a new proof of
Stirling's identity.

31 Let k be a fixed but arbitrary positive integer. Denote by $a_k(n)$ the number of
(equally likely) ways to obtain a sum of n by rolling k (fair) dice.

(a) Find a closed formula for the generating function $g_k(x) = \sum_{n \ge 0} a_k(n)x^n$.

(b) Show that $g_4(x) = (x - x^7)^4 \sum_{n \ge 0} C(n + 3, 3)x^n$.

(c) Show that $a_4(20) = C(19, 3) - 4C(13, 3) + 6C(7, 3)$.

(d) Evaluate $a_4(20)$.

(e) Evaluate $a_4(24)$.

32 Given an integer $k \ge 2$, show that the number of partitions of n, none of whose
parts is (exactly) divisible by k, is equal to the number of partitions of n no part
of which has multiplicity as large as k.

33 Let $a_1, a_2, \ldots, a_m$ be fixed but arbitrary real numbers, and let
$E_n = E_n(a_1, a_2, \ldots, a_m)$. Denote by e(x) (not to be confused with $e^x$) the
generating function for $\{E_n\}$ so that $e(x) = E_0 + E_1x + E_2x^2 + \cdots$. Prove
that

$$e(x) = (1 + a_1x)(1 + a_2x)\cdots(1 + a_mx).$$

34 Let $a_1, a_2, \ldots, a_m$ be fixed but arbitrary real numbers, and let
$M_n = M_n(a_1, a_2, \ldots, a_m) = a_1^n + a_2^n + \cdots + a_m^n$ be their nth power sum. Define

$$M(x) = \sum_{n \ge 0} (-1)^n M_{n+1}x^n.$$

(a) Show that $M(x) = \sum_{r=1}^{m} a_r(1 + a_rx)^{-1}$.

(b) Show that $M(x) = \sum_{r=1}^{m} D_x \ln(1 + a_rx)$.

(c) Show that $M(x) = D_x \ln\left[\prod_{r=1}^{m}(1 + a_rx)\right]$.

(d) Show that $M(x) = D_x \ln(e(x))$, where e(x) is not $e^x$, but the generating
function from Exercise 33.

(e) Show that $M(x) = e'(x)/e(x)$, where e(x) is the function from part (d).

(f) Show that $e'(x) = M(x)e(x)$ is the generating function version of Newton's
identities.

35 Let $a_1, a_2, \ldots, a_m$ be fixed but arbitrary numbers. Their nth homogeneous
symmetric function $H_n = H_n(a_1, a_2, \ldots, a_m)$ is the sum of all $C(n + m - 1, n)$
monomials of degree n in $a_1, a_2, \ldots, a_m$, i.e., $H_n(a_1, a_2, \ldots, a_m) = \sum M_\alpha(a_1, a_2, \ldots, a_m)$,
where the summation is over all partitions $\alpha$ of n
having at most m parts. Let h(x) be the generating function for $\{H_n\}$, assuming
that $H_0 = 1$, i.e., $h(x) = 1 + H_1x + H_2x^2 + \cdots$.

(a) Show that $h(x) = [(1 - a_1x)(1 - a_2x)\cdots(1 - a_mx)]^{-1}$.

(b) Explain how/why part (a) is a generalization of Equation (4.29).

(c) Prove that $e(-x)h(x) = 1$, where e(x) is the function in Exercise 33.

(d) For every $n \ge 1$, prove that $\sum_{r=0}^{n}(-1)^r E_rH_{n-r} = 0$, where $E_r$ is defined in
Exercise 33.

(e) Confirm, by direct computation, that

$$H_3(a, b, c) - E_1(a, b, c)H_2(a, b, c) + E_2(a, b, c)H_1(a, b, c) - E_3(a, b, c) = 0.$$

(f) Prove that the elementary symmetric functions $E_n(x_1, x_2, \ldots, x_m)$, $1 \le n \le m$,
can be expressed as polynomials in the homogeneous symmetric
functions $H_n(x_1, x_2, \ldots, x_m)$, $1 \le n \le m$.

(g) Prove the following analog of the fundamental theorem of symmetric
polynomials: Any polynomial symmetric in the variables $x_1, x_2, \ldots, x_m$ is a
polynomial in the homogeneous symmetric functions $H_n(x_1, x_2, \ldots, x_m)$,
$1 \le n \le m$.

(h) Let H be the $(n + 1) \times (n + 1)$ matrix whose (i, j)-entry is zero if $j > i$
and $H_{i-j}(x_1, x_2, \ldots, x_m)$ if $j \le i$. Similarly, let E be the (n + 1)-square
matrix whose (i, j)-entry is zero if $j > i$ and $(-1)^{i-j}E_{i-j}(x_1, x_2, \ldots, x_m)$
otherwise. Prove that $H^{-1} = E$.

36 Prove that $S(n + r, r) = H_n(1, 2, \ldots, r)$. (Hint: Theorem 4.3.5 and Exercise
35(a). Compare with Exercise 11(c), Section 2.1.)

37 Prove that $\sum_{r=0}^{n}(-1)^r C(n, r)C(2n - r - 1, n - r) = 0$, $n \ge 1$.

38 Let $Z_n = Z_n(s_1, s_2, \ldots, s_n)$ be the cycle index polynomial for $S_n$ discussed in
Section 3.7. Let $f(x) = \sum_{n \ge 0} Z_nx^n$ be the generating function for $\{Z_n\}$. Using
Theorem 3.7.8(a) and Exercise 35(a), MacMahon showed that $f(x) = e^W$,
where

$$W = s_1x + \tfrac{1}{2}s_2x^2 + \tfrac{1}{3}s_3x^3 + \tfrac{1}{4}s_4x^4 + \cdots.$$

Let $f(x) = e^W$, and confirm that $f^{(n)}(0)/n! = Z_n(s_1, s_2, \ldots, s_n)$ when
(a) n = 0. (b) n = 1. (c) n = 2.
(d) n = 3. (e) n = 4.

39 Let r and s be fixed but arbitrary positive integers. Denote by $a_{(r,s)}(n)$ the
number of partitions of n that have at most s parts each of which is at most r.
Define $a_{(r,s)}(0) = 1$. Then (Exercise 27, Section 1.8), $\sum_{n \ge 0} a_{(r,s)}(n) = C(r + s, r)$.
Denote by $f_{(r,s)}(x) = \sum_{n \ge 0} a_{(r,s)}(n)x^n$ the generating function for
these numbers.

(a) Show that $f_{(2,2)}(x) = 1 + x + 2x^2 + x^3 + x^4$.

(b) Show that $f_{(3,2)}(x) = 1 + x + 2x^2 + 2x^3 + 2x^4 + x^5 + x^6$.

(c) The q-binomial coefficient is $C_q(r + s, r) = f_{(r,s)}(q)$. (From parts (a)
and (b), e.g., $C_q(4, 2) = 1 + q + 2q^2 + q^3 + q^4$ and $C_q(5, 2) = 1 + q + 2q^2 + 2q^3 + 2q^4 + q^5 + q^6$.)
Show that $C_q(r + s, 0) = 1 = C_q(r + s, r + s)$.

(d) Show that $C_q(r + s, r) = C_q(r + s, s)$. (See part (c).)

(e) Show that $C_q(r + s, r) = C_q(r + s - 1, r) + q^sC_q(r + s - 1, r - 1)$.

(f) Show that

$$C_q(r + s, s) = \frac{(1 - q)(1 - q^2)\cdots(1 - q^{r+s})}{(1 - q)(1 - q^2)\cdots(1 - q^r) \times (1 - q)(1 - q^2)\cdots(1 - q^s)}.$$

(g) Prove that

$$f_{(r,s)}(x) = \frac{(1 - x)(1 - x^2)\cdots(1 - x^{r+s})}{(1 - x)(1 - x^2)\cdots(1 - x^r) \times (1 - x)(1 - x^2)\cdots(1 - x^s)}.$$

(h) Use the formula from part (g) to confirm part (a).

(i) Use the formula from part (g) to confirm part (b).

(j) Show that $\lim_{q \to 1} C_q(m, r) = C(m, r)$, binomial coefficient m-choose-r.

(k) Denote by W(r, s) the set of binary words of length r + s, with r bits
(letters) equal to 0 and s bits equal to 1. Suppose $w = b_1b_2\cdots b_m \in W(r, s)$ (so
that m = r + s). As in Section 1.8, Exercise 27(d), the inversion number of
$b_i$ is 0 if $b_i = 1$, and it is the number of 1's to the left of $b_i$ if $b_i = 0$. Define
Inv(w), the inversion number of w, to be the sum of the inversion numbers
of its bits, and show that

$$\sum_{w \in W(r,s)} q^{\mathrm{Inv}(w)} = C_q(r + s, r).$$

40 Denote by K(n) the number of ways to choose n elements from the set
A = {r, s, t}, with replacement, where order doesn't matter, but subject to the
conditions that r can be chosen at most three times, s at most twice, and t at
most once. Then K(n) is the number of n-element submultisets of A subject to
the multiplicity conditions on r, s, and t. When n = 5, e.g., the possible
submultisets are {r, r, r, s, s}, {r, r, r, s, t}, and {r, r, s, s, t}, so that K(5) = 3.
Letting

$$g(x) = \sum_{n \ge 0} K(n)x^n$$

be the generating function for {K(n)}, show that

(a) $g(x) = (1 + x + x^2 + x^3)(1 + x + x^2)(1 + x)$.

(b) $g(x) = 1 + 3x + 5x^2 + 6x^3 + 5x^4 + 3x^5 + x^6$.

(c) $g(x) = [(1 - x^4)(1 - x^3)(1 - x^2)]/(1 - x)^3$. (Compare with Equation
(4.29).)
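
The product in Exercise 40(a) can be expanded mechanically. The following is a minimal sketch, not part of the text: the helper name `poly_mul` and the list `limits` are illustrative choices, and each polynomial is stored as a list of coefficients with the constant term first.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (constant term first)."""
    product = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            product[i + j] += a * b
    return product


# One factor per element of A = {r, s, t}. An element that may be chosen
# at most c times contributes the factor 1 + x + ... + x^c.
limits = [3, 2, 1]
g = [1]
for c in limits:
    g = poly_mul(g, [1] * (c + 1))

print(g)  # the coefficients K(0), K(1), ..., K(6)
```

The entry `g[5]` is the coefficient of $x^5$, which should agree with the count K(5) obtained by listing submultisets in the exercise.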



4.4. EXPONENTIAL GENERATING FUNCTIONS 

Form ever follows function. 

— Louis Henri Sullivan 
Recall that the Bell numbers are sums of Stirling numbers of the second kind;

$$B_n = \sum_{r=1}^{n} S(n, r)$$

is the (total) number of partitions of {1, 2, . . . , n}. Setting $B_0 = 1$, the Bell numbers
satisfy the recurrence

$$B_{n+1} = \sum_{r=0}^{n} C(n, r)B_r. \tag{4.48}$$

Let's see if we can find a closed formula for the generating function

$$g(x) = \sum_{n \ge 0} B_nx^n = 1 + x + 2x^2 + 5x^3 + 15x^4 + 52x^5 + \cdots.$$

While it is true that

$$B_n = c_1B_{n-1} + c_2B_{n-2} + \cdots + c_nB_0,$$

neither the coefficient $c_r = C(n - 1, n - r)$ nor the number of terms is independent
of n. Equation (4.48) is not a homogeneous linear recurrence as defined by Equation
(4.21). Partial fractions are of no use here. A new idea is needed.

Let's see what happens if we multiply by $e^x$, the generating function for {1/n!}:

$$e^xg(x) = \left(\sum_{n \ge 0} \frac{x^n}{n!}\right)\left(\sum_{n \ge 0} B_nx^n\right) = \sum_{n \ge 0}\left(\sum_{r=0}^{n} \frac{B_r}{(n - r)!}\right)x^n = \sum_{n \ge 0} \frac{1}{n!}\left(\sum_{r=0}^{n} \frac{n!}{(n - r)!}B_r\right)x^n. \tag{4.49}$$

This is the point at which we might expect Equation (4.48) to be helpful. And, it
would be, if there were just an r! in the denominator of Equation (4.49). (We were
able to multiply and divide by n!, and then move 1/n! outside the parentheses,
because n! is independent of the index of summation r. The same approach clearly
will not work for r!.)

Playing by the usual rules, there appears to be no way to solve the problem of the
missing r!. So, let's change the rules. If we can't find a closed formula for
$g(x) = \sum_{n \ge 0} B_nx^n$, let's instead consider

$$g(x) = \sum_{n \ge 0} B_n\frac{x^n}{n!}.$$

The effect of repeating the same steps with this new formal power series is to
replace $B_r$ in Equation (4.49) with $B_r/r!$. With the new g(x), we obtain

$$e^xg(x) = \sum_{n \ge 0} \frac{1}{n!}\left(\sum_{r=0}^{n} \frac{n!}{(n - r)!\,r!}B_r\right)x^n = \sum_{n \ge 0}\left(\sum_{r=0}^{n} C(n, r)B_r\right)\frac{x^n}{n!} = \sum_{n \ge 0} B_{n+1}\frac{x^n}{n!} = D_xg(x), \tag{4.50}$$

the formal (term-by-term) derivative of g(x).*

Assuming that the revised power series has a positive radius of convergence, we 
may treat D x g(x) as the ordinary derivative g'(x). In this case, dividing both sides of 
Equation (4.50) by g(x) and antidifferentiating, we obtain 



$$\int \frac{g'(x)}{g(x)}\,dx = \int e^x\,dx.$$

It follows that $\ln(g(x)) = e^x + C$. Substituting g(0) = 1 gives $\ln(1) = e^0 + C$, or
0 = 1 + C. Hence, $\ln(g(x)) = e^x - 1$. Exponentiating both sides gives

$$g(x) = \exp(e^x - 1) = e^{e^x - 1}.$$

4.4.1 Definition. The exponential generating function for the sequence $\{a_n\}$ is

$$g(x) = \sum_{n \ge 0} a_nx^n/n!.$$

Evidently, the exponential generating function for $\{a_n\}$ is the ordinary generating
function for $\{a_n/n!\}$. If m is a fixed but arbitrary positive integer then, e.g.,

$$(1 + x)^m = C(m, 0) + C(m, 1)x + C(m, 2)x^2 + \cdots$$

is the ordinary generating function for {C(m, n)} and, since

$$\sum_{n \ge 0} C(m, n)x^n = \sum_{n \ge 0} P(m, n)\frac{x^n}{n!},$$

$(1 + x)^m$ is the exponential generating function for {P(m, n)}.

4.4.2 Theorem. The exponential generating function for the sequence $\{B_n\}$ of
Bell numbers is $\exp(e^x - 1)$.
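
As a numerical sanity check (not a proof), the coefficients of $\exp(e^x - 1)$ can be compared with the Bell numbers produced by the recurrence (4.48). The sketch below is illustrative only: the function names are made up for this example, and the series exponential uses the standard identity $h' = u'h$ for $h = \exp(u)$ with $u(0) = 0$, carried out in exact rational arithmetic.

```python
from fractions import Fraction
from math import comb, factorial


def bell_by_recurrence(count):
    """First `count` Bell numbers from B_{n+1} = sum_r C(n, r) B_r, with B_0 = 1."""
    bell = [1]
    for n in range(count - 1):
        bell.append(sum(comb(n, r) * bell[r] for r in range(n + 1)))
    return bell


def exp_series(u, terms):
    """Coefficients of exp(u(x)) up to x^(terms-1), assuming u[0] == 0.

    Comparing coefficients in h' = u' h gives n*h_n = sum_{k=1..n} k*u_k*h_{n-k}.
    """
    h = [Fraction(1)] + [Fraction(0)] * (terms - 1)
    for n in range(1, terms):
        h[n] = sum(k * u[k] * h[n - k] for k in range(1, n + 1)) / n
    return h


def bell_by_egf(count):
    """Read B_n off as n! times the coefficient of x^n in exp(e^x - 1)."""
    # e^x - 1 = x/1! + x^2/2! + x^3/3! + ...
    u = [Fraction(0)] + [Fraction(1, factorial(k)) for k in range(1, count)]
    h = exp_series(u, count)
    return [int(h[n] * factorial(n)) for n in range(count)]


print(bell_by_recurrence(8))
print(bell_by_egf(8))
```

Both calls should print the same list, beginning 1, 1, 2, 5, 15, 52 as in the expansion of g(x) earlier in the section.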



*To those familiar with the use of integrating factors in differential equations, this may make the decision
to multiply by $e^x$ a little less mysterious.

Our derivation of Theorem 4.4.2 falls short of a proof because it relies on the
assumption that $g(x) = \sum_{n \ge 0}(B_n/n!)x^n$ has a positive radius of convergence. If
we knew a lot more about the Bell numbers, we might be able to prove this fact
using one of the familiar tests from calculus. (Having n! in the denominator can
do no harm whenever convergence is an issue.)

Alternatively, we know from calculus that the Maclaurin series for exp(x) converges
for all x. Thus, another way to prove Theorem 4.4.2 would be to show that
$B_n/n!$ is the coefficient of $x^n$ in the Maclaurin series expansion of $\exp(e^x - 1)$. This
is the approach taken in Exercise 5.

Had we known to look for an exponential generating function from the beginning,
the clever but mysterious "let's see what happens if we multiply by $e^x$" would
have been unnecessary. Multiplying both sides of Equation (4.48) by $x^n/n!$ and
summing on n yield

$$\sum_{n \ge 0} B_{n+1}\frac{x^n}{n!} = \sum_{n \ge 0}\left(\sum_{r=0}^{n} \frac{1}{(n - r)!}\,\frac{B_r}{r!}\right)x^n.$$

By Equations (4.19a)-(4.19b), this is equivalent to

$$\sum_{n \ge 0} B_{n+1}\frac{x^n}{n!} = \left(\sum_{n \ge 0} \frac{x^n}{n!}\right)\left(\sum_{n \ge 0} B_n\frac{x^n}{n!}\right),$$

i.e., to $D_xg(x) = e^xg(x)$, bringing us to Equation (4.50) by a more direct route.

Why introduce a new kind of generating function? Because it makes our work 
easier. At first blush, this might seem strange. After all, there is nothing particularly 
easy about deriving the closed formula $\exp(e^x - 1)$, nor is this formula especially
simple. On the other hand, suppose your job depended on being able to find a closed 
formula for some generating function for {B n }. If you think it would be easier to 
solve the ordinary generating function problem, by all means go for it! 

4.4.3 Example. Of what use is the formula $\exp(e^x - 1)$? Observe that

$$\begin{aligned}
\sum_{n \ge 0}(B_n/n!)x^n &= e^{e^x - 1} \\
&= e^{x/1! + x^2/2! + x^3/3! + \cdots} \\
&= e^{x/1!} \times e^{x^2/2!} \times e^{x^3/3!} \times \cdots \\
&= (1 + [x^1/1!] + [x^1/1!]^2/2! + [x^1/1!]^3/3! + \cdots) \\
&\quad \times (1 + [x^2/2!] + [x^2/2!]^2/2! + [x^2/2!]^3/3! + \cdots) \\
&\quad \times (1 + [x^3/3!] + [x^3/3!]^2/2! + [x^3/3!]^3/3! + \cdots) \\
&\quad \times \cdots
\end{aligned}$$

Comparing the coefficient of $x^n$ on either side of this equation, we obtain (after
multiplying by n!) that

$$B_n = \sum_{1t_1 + 2t_2 + \cdots + kt_k = n} \frac{n!}{(1!)^{t_1}t_1! \times (2!)^{t_2}t_2! \times \cdots \times (k!)^{t_k}t_k!}. \tag{4.51}$$

This is interesting for many reasons, not the least of which is that the left-hand side
pertains to the number of partitions (into disjoint subsets) of {1, 2, ..., n}. Because
$1t_1 + 2t_2 + \cdots + kt_k = n$ if and only if $[k^{t_k}, \ldots, 2^{t_2}, 1^{t_1}] \vdash n$, the right-hand side involves
partitions of (the integer) n.

The partitions of 4 are

[4] corresponding to $t_1 = 0$, $t_2 = 0$, $t_3 = 0$, and $t_4 = 1$;
[3,1] corresponding to $t_1 = 1$, $t_2 = 0$, $t_3 = 1$, and $t_4 = 0$;
[$2^2$] corresponding to $t_1 = 0$, $t_2 = 2$, $t_3 = 0$, and $t_4 = 0$;
[2,$1^2$] corresponding to $t_1 = 2$, $t_2 = 1$, $t_3 = 0$, and $t_4 = 0$;
[$1^4$] corresponding to $t_1 = 4$, $t_2 = 0$, $t_3 = 0$, and $t_4 = 0$.

Substituting these values into Equation (4.51), we obtain

$$\begin{aligned}
B_4 &= \frac{4!}{(4!)^1 1!} + \frac{4!}{(1!)^1 1!\,(3!)^1 1!} + \frac{4!}{(2!)^2 2!} + \frac{4!}{(1!)^2 2!\,(2!)^1 1!} + \frac{4!}{(1!)^4 4!} \\
&= 1 + 4 + 3 + 6 + 1 = 15.
\end{aligned}$$

□
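
The sum in Equation (4.51) runs over the partitions of the integer n, so it can be evaluated by listing those partitions. The following is a small sketch under that reading; the generator `partitions` and the function `bell_from_partitions` are hypothetical names introduced for this illustration.

```python
from math import factorial


def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples of positive parts."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for part in range(min(n, largest), 0, -1):
        for rest in partitions(n - part, part):
            yield (part,) + rest


def bell_from_partitions(n):
    """Evaluate the right-hand side of Equation (4.51).

    For each partition, t is the multiplicity of the part `size`, and the
    summand is n! divided by the product of (size!)^t * t! over distinct sizes.
    """
    total = 0
    for p in partitions(n):
        denom = 1
        for size in set(p):
            t = p.count(size)
            denom *= factorial(size) ** t * factorial(t)
        total += factorial(n) // denom
    return total


print([bell_from_partitions(n) for n in range(7)])
```

For n = 4 the five summands are the values 1, 4, 3, 6, 1 computed by hand in Example 4.4.3.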



4.4.4 Example. Without recognizing them as such, we have already seen many
examples of exponential generating functions. Consider, e.g., the sequence $\{c_n\}$
defined by

$$c_n = \begin{cases} 0 & \text{if } n = 2k + 1, \\ +1 & \text{if } n = 4k, \\ -1 & \text{if } n = 4k + 2, \end{cases}$$

the first few terms of which are 1, 0, -1, 0, 1, 0, -1, ... . The exponential generating
function for $\{c_n\}$,

$$1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots = \cos(x),$$

can be found in Example 4.2.13. What about the sequence $\{d_n\}$ defined by
$d_n = |c_n|$? Its exponential generating function is

$$1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \frac{x^6}{6!} + \cdots = \frac{e^x + e^{-x}}{2} = \cosh(x),$$

the hyperbolic cosine.
If m is fixed, the ordinary generating function for $\{m^n\}$ is $\sum_{n \ge 0} m^nx^n = \sum_{n \ge 0}(mx)^n = (1 - mx)^{-1}$
and its exponential generating function is $\sum_{n \ge 0} m^nx^n/n! = \sum_{n \ge 0}(mx)^n/n! = e^{mx} = \exp(mx)$.
What about $\{n^m\}$? According to Exercise 19, Section 4.2,

$$\sum_{n \ge 0} n^mx^n/n! = e^x\sum_{n=1}^{m} S(m, n)x^n. \tag{4.52}$$

By Newton's binomial theorem (with $|x| < \tfrac{1}{2}$),

$$(1 - 2x)^{-3/2} = 1/0! + 3x/1! + (3 \times 5)x^2/2! + (3 \times 5 \times 7)x^3/3! + \cdots$$

is the exponential generating function for the sequence $\{a_n\}$, where $a_n = 1 \times 3 \times 5 \times \cdots \times (2n + 1)$
is the product of the first n + 1 odd integers. (Confirm it!) □
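
One way to take up the invitation to "confirm it" is to compute the binomial series coefficients exactly. In the sketch below (helper names are illustrative, not from the text), the coefficient of $x^n$ in $(1 - 2x)^{-3/2}$ is taken to be $C(-3/2, n)(-2)^n$, where $C(\alpha, n) = \alpha(\alpha - 1)\cdots(\alpha - n + 1)/n!$ is the generalized binomial coefficient.

```python
from fractions import Fraction
from math import factorial


def generalized_binomial(alpha, n):
    """C(alpha, n) = alpha (alpha - 1) ... (alpha - n + 1) / n! for rational alpha."""
    value = Fraction(1)
    for i in range(n):
        value *= alpha - i
    return value / factorial(n)


def odd_product(n):
    """The product 1 x 3 x 5 x ... x (2n + 1) of the first n + 1 odd integers."""
    result = 1
    for k in range(n + 1):
        result *= 2 * k + 1
    return result


alpha = Fraction(-3, 2)
for n in range(8):
    # Coefficient of x^n in (1 - 2x)^alpha.
    coeff = generalized_binomial(alpha, n) * (-2) ** n
    # An exponential generating function carries a_n / n! on x^n.
    assert coeff * factorial(n) == odd_product(n)
```

The loop finishing without an assertion error is consistent with the claim for n = 0, ..., 7; it is a check of finitely many coefficients, not a proof.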

If f(x) and g(x) are exponential generating functions for $\{a_n\}$ and $\{b_n\}$, respectively,
then

$$f(x)g(x) = \left(\sum_{n \ge 0} a_nx^n/n!\right)\left(\sum_{n \ge 0} b_nx^n/n!\right) = \sum_{n \ge 0}\left[\sum_{r=0}^{n} C(n, r)a_rb_{n-r}\right]x^n/n!,$$

the exponential generating function for the sequence $\{c_n\}$ defined by

$$c_n = \sum_{r=0}^{n} C(n, r)a_rb_{n-r}. \tag{4.53}$$

(Compare and contrast Equations (4.19b) and (4.53).)
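
Equation (4.53) is sometimes called a binomial convolution. A minimal sketch follows; the function name `binomial_convolution` and the sample sequences are assumptions made for illustration. Sequences are stored as lists indexed from 0.

```python
from math import comb


def binomial_convolution(a, b):
    """Return c with c[n] = sum_{r=0..n} C(n, r) * a[r] * b[n - r].

    If a and b hold the coefficients a_n and b_n of two exponential
    generating functions, c holds those of their product.
    """
    terms = min(len(a), len(b))
    return [
        sum(comb(n, r) * a[r] * b[n - r] for r in range(n + 1))
        for n in range(terms)
    ]


ones = [1] * 7                        # e^x is the EGF of 1, 1, 1, ...
bell = [1, 1, 2, 5, 15, 52, 203]      # B_0, ..., B_6

print(binomial_convolution(ones, ones))  # EGF of e^x * e^x = e^(2x): powers of 2
print(binomial_convolution(ones, bell))  # by (4.48), the Bell numbers shifted by one
```

The second call illustrates why multiplying by $e^x$ meshes with the Bell recurrence: its n-th entry is $\sum_r C(n, r)B_r = B_{n+1}$.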

If $a_n = 1$ for all n, then $f(x) = e^x$ and, after a change of variable, the right-hand
side of Equation (4.53) becomes

$$\sum_{r=0}^{n} C(n, n - r)b_r = \sum_{r=0}^{n} C(n, r)b_r. \tag{4.54}$$

Comparing the right-hand sides of Equations (4.48) and (4.54) should strip
away any remaining mystery about the curious decision to "see what happens if
we multiply by $e^x$."
Recall that a derangement is a permutation with no fixed points. From Equation
(2.18) in Section 2.3,

$$\frac{D(n)}{n!} = \frac{1}{0!} - \frac{1}{1!} + \frac{1}{2!} - \frac{1}{3!} + \cdots + \frac{(-1)^n}{n!}.$$

Therefore, the (exponential) generating function

$$g(x) = \sum_{n \ge 0} D(n)\frac{x^n}{n!} = \sum_{n \ge 0}\left(\sum_{r=0}^{n} \frac{(-1)^r}{r!}\right)x^n.$$

It follows, from either Lemma 4.2.7 or Equations (4.19a)-(4.19b), that

$$g(x) = \left(\sum_{n \ge 0} \frac{(-1)^n}{n!}x^n\right)\left(\sum_{n \ge 0} x^n\right) = e^{-x}(1 - x)^{-1}.$$
Let's summarize. 

4.4.5 Theorem. The exponential generating function for the derangement
numbers is

$$\sum_{n \ge 0} D(n)x^n/n! = \frac{e^{-x}}{1 - x}. \tag{4.55}$$
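
Theorem 4.4.5 can be spot-checked by brute force. In the sketch below (function names are illustrative), multiplying the series for $e^{-x}$ by $1/(1 - x)$ is carried out as a running partial sum of coefficients, and the result is compared with a direct count of fixed-point-free permutations.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def derangements_from_egf(count):
    """n! times the coefficient of x^n in e^(-x) / (1 - x), for n < count.

    Multiplying a series by 1/(1 - x) replaces its coefficients by partial
    sums, so the coefficient of x^n is sum_{r=0..n} (-1)^r / r!.
    """
    values = []
    partial = Fraction(0)
    for n in range(count):
        partial += Fraction((-1) ** n, factorial(n))
        values.append(int(partial * factorial(n)))
    return values


def derangements_by_counting(n):
    """Count the permutations of {0, ..., n-1} with no fixed point."""
    return sum(
        all(p[i] != i for i in range(n)) for p in permutations(range(n))
    )


print(derangements_from_egf(7))
print([derangements_by_counting(n) for n in range(7)])
```

The brute-force count grows like n!, so it is only practical for small n; it is included purely as an independent check.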



Speaking of fixed points and derangements leads to cycle structure and Stirling
numbers of the first kind. Let k be a fixed but arbitrary positive integer and define

$$h_k(x) = \sum_{n \ge k} s(n, k)x^n/n!, \tag{4.56}$$

the exponential generating function for {s(n, k)}.
Recalling that s(n, 1) = (n - 1)!,

$$h_1(x) = \sum_{n \ge 1}(n - 1)!\,x^n/n! = x + x^2/2 + x^3/3 + \cdots. \tag{4.57}$$

The right-hand side of Equation (4.57) is the Maclaurin series expansion for
$-\ln(1 - x)$.
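If k > 1, the derivative of Equation (4.56) is

$$\begin{aligned}
D_xh_k(x) &= \sum_{n \ge k-1} s(n + 1, k)\frac{x^n}{n!} \\
&= \sum_{n \ge k-1}\left[s(n, k - 1) + n\,s(n, k)\right]\frac{x^n}{n!} \\
&= \sum_{n \ge k-1} s(n, k - 1)\frac{x^n}{n!} + x\sum_{n \ge k} s(n, k)\frac{x^{n-1}}{(n - 1)!} \\
&= h_{k-1}(x) + xD_xh_k(x).
\end{aligned}$$

So, $(1 - x)D_xh_k(x) = h_{k-1}(x)$. Assuming a positive radius of convergence for
Equation (4.56),

$$h_k(x) = \int \frac{h_{k-1}(x)}{1 - x}\,dx. \tag{4.58}$$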



4.4.6 Theorem. Let k be a fixed positive integer. Then the exponential generating
function for {s(n, k)}, the sequence of Stirling numbers of the first kind, is

$$\sum_{n \ge k} s(n, k)x^n/n! = \frac{[-\ln(1 - x)]^k}{k!}.$$



Proof. The k = 1 case follows from Equation (4.57) and the Maclaurin series
expansion of $-\ln(1 - x)$. Because this expansion has a positive radius of convergence,
namely r = 1, Theorem 4.2.9 can be applied to the k = 2 case of Equation
(4.58) to obtain

$$h_2(x) = \int \frac{-\ln(1 - x)}{1 - x}\,dx = \tfrac{1}{2}[-\ln(1 - x)]^2$$

(where the constant of integration is s(0, 2) = 0). Moreover, also from Theorem
4.2.9, $h_2(x)$ has radius of convergence r = 1. The general formula follows from
Equation (4.58) using induction on k (and integration by substitution). ■
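
The first few cases of Theorem 4.4.6 can be checked with truncated power series. The sketch below assumes, as the derivation above does, that s(n, k) satisfies s(n + 1, k) = s(n, k - 1) + n s(n, k) with s(0, 0) = 1; the helper names and the truncation length `TERMS` are arbitrary choices for this illustration.

```python
from fractions import Fraction
from math import factorial


def series_mul(p, q, terms):
    """Product of two power series, truncated after the x^(terms-1) term."""
    out = [Fraction(0)] * terms
    for i, a in enumerate(p[:terms]):
        for j, b in enumerate(q[:terms - i]):
            out[i + j] += a * b
    return out


def first_kind_table(max_n):
    """Table s[n][k] built from s(n+1, k) = s(n, k-1) + n * s(n, k)."""
    s = [[0] * (max_n + 1) for _ in range(max_n + 1)]
    s[0][0] = 1
    for n in range(max_n):
        for k in range(1, n + 2):
            s[n + 1][k] = s[n][k - 1] + n * s[n][k]
    return s


TERMS = 8
# -ln(1 - x) = x + x^2/2 + x^3/3 + ...
neg_log = [Fraction(0)] + [Fraction(1, n) for n in range(1, TERMS)]
s = first_kind_table(TERMS - 1)

power = [Fraction(1)] + [Fraction(0)] * (TERMS - 1)  # (-ln(1 - x))^0
for k in range(1, 4):
    power = series_mul(power, neg_log, TERMS)        # now (-ln(1 - x))^k
    for n in range(TERMS):
        # n! * [x^n] of (-ln(1 - x))^k / k! should equal s(n, k).
        assert power[n] / factorial(k) * factorial(n) == s[n][k]
```

Exact fractions are used so that the comparison is not affected by rounding.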
What about Stirling numbers of the second kind? Recall that, apart from some
minus signs, the matrix manifestations of the two arrays of Stirling numbers are
inverses of each other. Given the appearance of natural logarithms in Theorem
4.4.6, it is natural to wonder whether the inverse of the logarithm function will
emerge in a discussion of

$$g_r(x) = \sum_{n \ge 0} S(n, r)x^n/n!.$$

Let's see.

By Stirling's identity (Corollary 2.2.4),

$$r!\,g_r(x) = \sum_{n \ge 0}\left(\sum_{t=0}^{r}(-1)^{r+t}C(r, t)t^n\right)x^n/n!,$$

so

$$\begin{aligned}
r!\,g_r(x) &= \sum_{t=0}^{r}(-1)^{r+t}C(r, t)\sum_{n \ge 0}(tx)^n/n! \\
&= \sum_{t=0}^{r}(-1)^{r+t}C(r, t)e^{tx} \\
&= \sum_{t=0}^{r} C(r, t)(e^x)^t(-1)^{r-t} \\
&= (e^x - 1)^r.
\end{aligned}$$

Therefore,

$$g_r(x) = (e^x - 1)^r/r!.$$

We have proved the following:

4.4.7 Theorem. Let r be a fixed positive integer. The exponential generating
function for the sequence {S(n, r)} of Stirling numbers of the second kind is

$$\sum_{n \ge 0} S(n, r)x^n/n! = (e^x - 1)^r/r!. \tag{4.59}$$
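
A quick numerical check of Equation (4.59) is sketched below. It assumes the familiar recurrence S(n + 1, r) = r S(n, r) + S(n, r - 1) with S(0, 0) = 1 as an independent source of Stirling numbers, and it extracts the coefficients of $(e^x - 1)^r$ by expanding with the binomial theorem into terms $e^{tx}$, whose coefficient of $x^n/n!$ is $t^n$. All names are illustrative.

```python
from math import comb, factorial


def second_kind_table(max_n):
    """Table S[n][r] built from S(n+1, r) = r * S(n, r) + S(n, r-1)."""
    S = [[0] * (max_n + 1) for _ in range(max_n + 1)]
    S[0][0] = 1
    for n in range(max_n):
        for r in range(1, n + 2):
            S[n + 1][r] = r * S[n][r] + S[n][r - 1]
    return S


def egf_coefficient(n, r):
    """n! times the coefficient of x^n in (e^x - 1)^r / r!.

    (e^x - 1)^r = sum_t (-1)^(r - t) C(r, t) e^(tx), and n! [x^n] e^(tx) = t^n.
    """
    total = sum((-1) ** (r - t) * comb(r, t) * t ** n for t in range(r + 1))
    return total // factorial(r)


S = second_kind_table(8)
for r in range(1, 5):
    print(r, [egf_coefficient(n, r) for n in range(9)])
```

Each printed row should match the corresponding column of the table `S`; for instance the row for r = 2 begins 0, 0, 1, 3, 7, 15.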



4.4.8 Example. If the truth be known, it is a rare sequence for which even
one variety of generating function has a nice closed formula. When $\{a_n\}$ has closed
formula generating functions of more than one kind, they tend to be very different.
Recall that the ordinary generating function for $\{m^n\}$ is $(1 - mx)^{-1}$ and its exponential
generating function is $e^{mx} = \exp(mx)$. The ordinary generating function for
{S(n, r)} is

$$x^r\prod_{t=1}^{r} \frac{1}{1 - tx}$$

(Theorem 4.3.5), while its exponential generating function is $(e^x - 1)^r/r!$. □

Because the Bell numbers are sums of Stirling numbers of the second kind, Theorem
4.4.7 should yield another proof of Theorem 4.4.2. Setting S(0, 0) = 1 and
summing both sides of Equation (4.59) on r, we obtain

$$\sum_{r \ge 0}\sum_{n \ge 0} S(n, r)x^n/n! = \sum_{r \ge 0}(e^x - 1)^r/r!.$$

So,

$$\sum_{n \ge 0}\left(\sum_{r \ge 0} S(n, r)\right)x^n/n! = \exp(e^x - 1), \tag{4.60}$$

i.e.,

$$\sum_{n \ge 0} B_nx^n/n! = \exp(e^x - 1).$$

Asked to find a generating function for the Bell numbers, we found an "exponential"
generating function instead. There is, of course, nothing particularly sacred
about the sequence

$$1, x, x^2, x^3, \ldots.$$

Not only does

$$\frac{1}{0!}, \frac{x}{1!}, \frac{x^2}{2!}, \frac{x^3}{3!}, \ldots$$

work just as well, it enhances the likelihood of convergence. It is natural to wonder
if other sequences might yield interesting results. For example, what about basing a
generating function on

$$1^s, 2^s, 3^s, 4^s, \ldots?$$

4.4.9 Definition. The Dirichlet* generating function of $\{a_n\}$ is the formal series

$$\sum_{n \ge 1} \frac{a_n}{n^s}.$$

*After Peter Gustav Lejeune Dirichlet (1805-1859).

There are several things to note right away about this definition. First, the variable
has changed from x to s. This is an inconsequential change, having more to do
with tradition than mathematics. The second is that we have used, not $n^s$, but $n^{-s}$.
This is a consequence of some experience; it turns out to be more useful. Finally,
the summation starts with n = 1, which is necessary to avoid dividing by zero.

4.4.10 Example. Let $\{a_n\}$ be the trivial sequence defined by $a_n = 1$ for all n. Its
Dirichlet generating function is the Riemann zeta function,

$$\zeta(s) = \sum_{n \ge 1} \frac{1}{n^s}.$$

□

The Dirichlet generating function analogue of Equations (4.19b) and (4.53) may 
help to suggest why they are important in number theory. 

4.4.11 Theorem. Let f(s) and g(s) be Dirichlet generating functions for the
sequences $\{a_n\}$ and $\{b_n\}$, respectively. Then f(s)g(s) is the Dirichlet generating
function for $\{c_n\}$, where

$$c_n = \sum_{km = n} a_kb_m,$$

the sum over all (ordered) factorizations n = km.

Proof.

$$\begin{aligned}
f(s)g(s) &= \left(\sum_{k \ge 1} \frac{a_k}{k^s}\right)\left(\sum_{m \ge 1} \frac{b_m}{m^s}\right) \\
&= (a_1b_1) + (a_1b_2 + a_2b_1)2^{-s} + (a_1b_3 + a_3b_1)3^{-s} + (a_1b_4 + a_2b_2 + a_4b_1)4^{-s} \\
&\quad + (a_1b_5 + a_5b_1)5^{-s} + (a_1b_6 + a_2b_3 + a_3b_2 + a_6b_1)6^{-s} + \cdots
\end{aligned}$$

In general,

$$\frac{a_k}{k^s}\,\frac{b_m}{m^s} = \frac{a_kb_m}{n^s}$$

if and only if km = n, i.e., $a_kb_m$ is a summand in the coefficient of $n^{-s}$ if and only if
km = n. ■

'Named after Georg Friedrich Bernhard Riemann (1826-1866). See, e.g., G. H. Hardy and E. M. Wright, 
An Introduction to the Theory of Numbers, 5th ed., Clarendon Press, Oxford, 1979, for a discussion of the 
zeta function. 

4.4.12 Definition. Read "d divides n", the notation "d|n" means that there
exists an integer q such that n = dq, i.e., that d exactly divides n, or that n is a multiple
of d.

Using this notation, the conclusion of Theorem 4.4.11 can be written as

$$c_n = \sum_{d \mid n} a_db_{n/d}, \tag{4.62}$$

from which it follows, e.g., that

$$\zeta(s)^2 = \sum_{n \ge 1} \frac{d(n)}{n^s},$$

the Dirichlet generating function for the sequence {d(n)}, where d(n) is the number
of (exact, positive-integer) divisors of n.
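
The coefficient rule in Equation (4.62) is easy to carry out on truncated sequences. The following sketch uses a hypothetical helper `dirichlet_convolution`; sequences are stored in lists indexed from 1, with index 0 unused, and the cutoff `LIMIT` is an arbitrary choice.

```python
def dirichlet_convolution(a, b, limit):
    """Return c[0..limit] with c[n] = sum over divisors d of n of a[d] * b[n // d].

    Index 0 is unused so that list positions match the subscripts in the text.
    """
    c = [0] * (limit + 1)
    for d in range(1, limit + 1):
        for m in range(1, limit // d + 1):
            c[d * m] += a[d] * b[m]   # the ordered factorization n = d * m
    return c


LIMIT = 12
ones = [0] + [1] * LIMIT              # the sequence a_n = 1, whose series is zeta(s)
divisor_counts = dirichlet_convolution(ones, ones, LIMIT)

print(divisor_counts[1:])             # d(1), d(2), ..., d(12)
```

Convolving the constant sequence with itself counts the ordered factorizations n = km, which is the number of divisors d(n).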

4.4.13 Example. If $n = p_1^{a_1}p_2^{a_2}\cdots p_r^{a_r}$, where $p_1, p_2, \ldots, p_r$ are different primes,
then (Example 1.1.4)

$$d(n) = (a_1 + 1)(a_2 + 1)\cdots(a_r + 1) = d(p_1^{a_1})d(p_2^{a_2})\cdots d(p_r^{a_r}). \tag{4.63}$$

If m and n are relatively prime, then no prime divisor of m is a prime divisor
of n, and vice versa. It follows from Equation (4.63), therefore, that d(mn) =
d(m)d(n). □

4.4.14 Definition. A number-theoretic function is one whose domain is the set 
of positive integers. A number-theoretic function / is multiplicative if f(mn) = 
f(m)f(n) whenever m and n are relatively prime. 

If n is a positive integer and $0 \ne f$ is a multiplicative number-theoretic function,
then $f(n) = f(1 \times n) = f(1)f(n)$, from which it follows that f(1) = 1.

4.4.15 Lemma. Let f be a multiplicative number-theoretic function. If
$n = p_1^{a_1}p_2^{a_2}\cdots p_r^{a_r}$, where $p_1, p_2, \ldots, p_r$ are different primes, then

$$f(n) = f(p_1^{a_1})f(p_2^{a_2})\cdots f(p_r^{a_r}). \tag{4.64}$$

Conversely, a (numerical valued) function f defined arbitrarily on positive-integer
powers of primes can be extended to a multiplicative number-theoretic function by
defining f(1) = 1 and f(n) by Equation (4.64) for all composite positive n.

The proof is left to the exercises. 

If f is any multiplicative number-theoretic function, then the Dirichlet generating
function for the sequence defined by $a_n = f(n)$, $n \ge 1$, can be expressed in an interesting
way.

4.4.16 Theorem. If f is a multiplicative number-theoretic function, then

$$\sum_{n \ge 1} \frac{f(n)}{n^s} = \prod_p\left[1 + f(p)p^{-s} + f(p^2)p^{-2s} + f(p^3)p^{-3s} + \cdots\right], \tag{4.65}$$

where the product is over the (positive) prime numbers p.

Proof. Consider a generic positive integer $n = 2^a3^b5^c\cdots$. In the product

$$\left[1 + \frac{f(2)}{2^s} + \frac{f(2^2)}{2^{2s}} + \cdots\right]\left[1 + \frac{f(3)}{3^s} + \frac{f(3^2)}{3^{2s}} + \cdots\right]\left[1 + \frac{f(5)}{5^s} + \frac{f(5^2)}{5^{2s}} + \cdots\right]\cdots,$$

only the first set of brackets contains terms with 2's in the denominator, and only
one of these denominators is $2^{as}$. Thus, for a product comprised of one term from
each set of brackets to have a denominator equal to $n^s$, the unique choice from the
first set must be $f(2^a)/2^{as}$. Similarly, the unique choice from the second set of
brackets must be $f(3^b)/3^{bs}$, the unique choice from the third set must be
$f(5^c)/5^{cs}$, and so on. In particular, $n^{-s}$ is produced only once on the right-hand
side of Equation (4.65), and its coefficient is

$$f(2^a)f(3^b)f(5^c)\cdots = f(2^a3^b5^c\cdots) = f(n)$$

because f is multiplicative. ■

4.4.17 Example. Suppose f(n) = 1 for all n. Then, because f is a multiplicative
number-theoretic function, we can use Theorem 4.4.16 to obtain an identity for its
Dirichlet generating function $\zeta(s)$. Evidently, the Riemann zeta function

$$\zeta(s) = \prod_p(1 + p^{-s} + p^{-2s} + p^{-3s} + \cdots) = \prod_p \frac{1}{1 - p^{-s}}.$$

□



4.4.18 Example. The multiplicative number-theoretic Möbius function* $\mu$ is
defined by

$$\mu(p^a) = \begin{cases} +1 & \text{if } a = 0, \\ -1 & \text{if } a = 1, \\ 0 & \text{if } a \ge 2 \end{cases}$$

when $n = p^a$ is a power of a prime and (using Lemma 4.4.15) by

$$\mu(n) = \mu(p_1^{a_1})\mu(p_2^{a_2})\cdots\mu(p_r^{a_r})$$

when $n = p_1^{a_1}p_2^{a_2}\cdots p_r^{a_r}$ is composite. So, $\mu(1) = 1$, $\mu(n) = (-1)^k$ if the prime
factorization of n consists of k distinct primes, and $\mu(n) = 0$ whenever n is (exactly)
divisible by the square of a prime. The first few values of $\mu$ are listed in Fig. 4.4.1.

n       1    2    3    4    5    6    7    8    9   10   11   12
μ(n)    1   -1   -1    0   -1    1   -1    0    0    1   -1    0

Figure 4.4.1. The Möbius function.

Denote by M(s) the Dirichlet generating function for the Möbius sequence,
defined by $a_n = \mu(n)$, $n \ge 1$. Then, by Theorem 4.4.16 and Example 4.4.17,

$$M(s) = \sum_{n \ge 1} \frac{\mu(n)}{n^s} = \prod_p(1 - p^{-s}) = \frac{1}{\zeta(s)}. \tag{4.66}$$

*Named for August Ferdinand Möbius (1790-1868).



4.4.19 Corollary. If sequences $\{a_n\}$ and $\{b_n\}$ satisfy

$$a_n = \sum_{d \mid n} b_d, \quad n \ge 1, \tag{4.67}$$

then

$$b_n = \sum_{d \mid n} \mu(n/d)a_d. \tag{4.68}$$

Proof. Let A(s) and B(s) be the Dirichlet generating functions for $\{a_n\}$ and $\{b_n\}$,
respectively. Then, by Equations (4.62) and (4.67), $A(s) = B(s)\zeta(s)$. So, by
Equation (4.66), A(s)M(s) = B(s). Equation (4.68) now follows from another
application of Equation (4.62). ■

The transformation from Equation (4.67) to Equation (4.68) is commonly
referred to as Möbius inversion.

4.4.20 Example. Let $s(n) = \sum_{d \mid n} d$, the sum of the divisors of n. (Then, e.g., n is
perfect if and only if s(n) = 2n.) It follows from Möbius inversion that

$$n = \sum_{d \mid n} \mu(n/d)s(d). \tag{4.69}$$

Let's confirm Equation (4.69), e.g., for n = 6 (the first perfect number):

$$\begin{aligned}
\sum_{d \mid 6} \mu(6/d)s(d) &= \mu(6)s(1) + \mu(3)s(2) + \mu(2)s(3) + \mu(1)s(6) \\
&= s(1) - s(2) - s(3) + s(6) \\
&= 1 - (1 + 2) - (1 + 3) + (1 + 2 + 3 + 6) \\
&= 6.
\end{aligned}$$

What about another example, maybe n = 8? (Because s(8) = 1 + 2 + 4 + 8 =
15 < 2 × 8, 8 is deficient.) Observe that

$$\begin{aligned}
\sum_{d \mid 8} \mu(8/d)s(d) &= \mu(8)s(1) + \mu(4)s(2) + \mu(2)s(4) + \mu(1)s(8) \\
&= 0 \times s(1) + 0 \times s(2) - s(4) + s(8) \\
&= -(1 + 2 + 4) + (1 + 2 + 4 + 8) \\
&= 8.
\end{aligned}$$


4.4. EXERCISES 

1 Let g(x) be the exponential generating function for the sequence $\{a_n\}$. Find a
closed formula for $a_n$ if

(a) g(jc) = xe 2x . (b) g(x) = + e 3x . 

(c) $g(x) = e^x(x + x^2)$. (d) $g(x) = e^x(x + 3x^2 + x^3)$.

2 Find a closed formula for the exponential generating function of the sequence 

(a) $\{n\}$.

(b) Ki- 
(c) 0, 1, 0, -1, 0, 1, 0, -1, ...
(d) 0,1,0,1,0,1,0,1,... 

3 The purpose of this exercise is to outline another approach to the proof of
Theorem 4.4.7. Let r be a fixed but arbitrary positive integer, and let
$g_r(x) = \sum_{n \ge r} S(n, r)x^n/n!$ be the exponential generating function for {S(n, r)}.

(a) Show that $g_1(x) = e^x - 1$.

(b) Show that the derivative $D_xg_r(x) = rg_r(x) + g_{r-1}(x)$, r > 1.

(c) Define $f_r(x) = (e^x - 1)^r/r!$ and show that $f_1(x) = e^x - 1$.

(d) Show that $D_xf_r(x) = rf_r(x) + f_{r-1}(x)$, r > 1.

(e) Show that $g_r(0) = f_r(0)$.

(f) Prove Theorem 4.4.7.

(g) Confirm directly that $S(n, r) = f_r^{(n)}(0)$, the nth derivative of $f_r(x)$ evaluated
at x = 0.

4 Use Equation (4.51) to confirm that

(a) $B_3 = 5$.

(b) $B_5 = 52$.

5 Let $f(x) = \exp(e^x - 1)$. Show that

(a) $f'(x) = \exp(e^x + x - 1)$.

(b) $f''(x) = \exp(e^x + 2x - 1) + \exp(e^x + x - 1)$.

(c) $f'''(x) = \exp(e^x + 3x - 1) + 3\exp(e^x + 2x - 1) + \exp(e^x + x - 1)$.

(d) $f^{(4)}(x) = \sum_{r=1}^{4} S(4, r)\exp(e^x + rx - 1)$.

(e) $f^{(n)}(x) = \sum_{r=1}^{n} S(n, r)\exp(e^x + rx - 1)$.

(f) $f^{(n)}(0) = B_n$.

(g) $f(x) = \sum_{n \ge 0} B_nx^n/n!$.

6 Recall that the Maclaurin series expansion of $\ln(1 - x)$ is

$$\ln(1 - x) = -x - \frac{x^2}{2} - \frac{x^3}{3} - \cdots.$$

(a) Find a closed formula for the exponential generating function of the
sequence defined by $a_n = (n - 1)!$, $n \ge 1$.

(b) Find a closed formula for the exponential generating function of the
sequence defined by $a_n = (-1)^n(n - 1)!$, $n \ge 1$.

(c) Show that

$$x = (e^x - 1) - \tfrac{1}{2}(e^x - 1)^2 + \tfrac{1}{3}(e^x - 1)^3 - \tfrac{1}{4}(e^x - 1)^4 + \cdots$$

7 Use Exercise 6(c) to prove that

$$0!\,S(n, 1) - 1!\,S(n, 2) + 2!\,S(n, 3) - 3!\,S(n, 4) + \cdots + (-1)^{n-1}(n - 1)!\,S(n, n) = 0$$

for all $n \ge 2$.

8 If $f(x) = (1 - x)^{-1}e^{-x}$ then, by Theorem 4.4.5 and the general theory of
Maclaurin series, $f^{(n)}(0) = D(n)$, $n \ge 0$. Compute and confirm that

$$f^{(n)}(0) = D(n), \quad 0 \le n \le 4.$$

9 Let $h_2(x)$ be the exponential generating function for {s(n, 2)}.

(a) Show that $D_xh_2(x) = \dfrac{-\ln(1 - x)}{1 - x}$.

(b) Show that $D_xh_2(x) = \sum_{n \ge 1} s(n + 1, 2)x^n/n!$.

(c) Use parts (a) and (b) to obtain a new derivation of the closed formula for the
(ordinary) generating function of the harmonic numbers given in Example
4.2.8.

10 Give the (exponential) generating function proof that

$$\sum_{r=0}^{n}(-1)^rC(n, r) = 0, \quad n \ge 1. \quad (\text{Hint: } e^x \times e^{-x} = 1.)$$

11 Give the generating function proof that

(a) $\sum_{r=0}^{n}(-1)^rC(m + r - 1, r)C(m, n - r) = 0$, $n \ge 1$.

(b) $\sum_{k=r}^{n} C(k, r) = C(n + 1, r + 1)$.

(c) $D(n) = nD(n - 1) + (-1)^n$.

12 Explain why $(x + 1)^m(x + 1)^n = (x + 1)^{m+n}$ is the generating function proof
of Vandermonde's identity (Exercise 15, Section 1.5).

13 Perhaps the most curious thing to occur in this section was the invitation to
"see what happens if we multiply by $e^x$." If g(x) is the exponential generating
function for the sequence of derangement numbers {D(n)} (with D(0) = 1),
"see what happens" if you multiply by $e^x$.

14 Show that the derangement numbers satisfy 

n\ ~ [ > (n-l)\ + (n-2)! ' 
(Hint: Exercise 13, Section 2.3.) 

15 Let $g(x) = 1 + \tfrac{1}{2}x^2 + \tfrac{1}{3}x^3 + \tfrac{3}{8}x^4 + \tfrac{11}{30}x^5 + \cdots$ be the exponential generating
function for the derangement numbers. Use Exercise 14 to prove that

$$(1 - x)g'(x) = xg(x).$$

16 Use Exercise 15 as the basis for another proof of Theorem 4.4.5. 

17 Find a closed formula for the two-variable generating function 

/(ij)^^ck»)//. 

m>0 «>0 

18 Let $g(x) = \sum_{n \ge 0} a_nx^n$ be the ordinary generating function for the sequence
$\{a_n\}$.

(a) Show that the "discrete derivative", $\Delta g(x) = g(x + 1) - g(x)$, is the
ordinary generating function for the sequence $\{b_n\}$ defined by

$$b_n = \sum_{m > n} a_mC(m, n), \quad n \ge 0.$$

(b) Show that the ordinary generating function for the difference sequence
$\{\Delta a_n\}$ is $[(1 - x)g(x) - a_0]/x$.

19 Tabulate the values of $\mu(n)$, $13 \le n \le 27$.

20 Prove that the Euler totient function (Definition 2.3.9) is a multiplicative
number-theoretic function. (Hint: Theorem 2.3.11.)

21 This exercise involves the Euler totient function from Definition 2.3.9.

(a) Prove that $n = \sum_{d \mid n} \varphi(d)$.

(b) Confirm the formula in part (a) when n = 6.

(c) Confirm the formula in part (a) when n = 10.

(d) Prove that $\varphi(n) = n\sum_{d \mid n} \mu(d)/d$.

(e) Confirm the formula in part (d) when n = 6.

(f) Confirm the formula in part (d) when n = 10.

22 Prove that $n/\varphi(n) = \sum_{d \mid n} \mu(d)^2/\varphi(d)$, where $\varphi$ is the Euler totient function
from Definition 2.3.9.

23 If n is a positive integer, prove that $\sum_{d \mid n} \mu(d) = \delta_{n,1}$.

24 Let f(n) = 1 if n is "square free" (not divisible by the square of any
prime) and zero otherwise. Prove that f is a multiplicative number-theoretic
function.

25 Prove that $\sum_{k \mid n} \mu(n/k)d(k) = 1$, $n \ge 1$, where d(k) is the number of divisors
of k.

26 If f is a multiplicative number-theoretic function, show that the number-theoretic
function g defined by $g(n) = \sum_{d \mid n} f(d)$, $n \ge 1$, is multiplicative.

27 Let $a_1, a_2, \ldots, a_m$ be fixed but arbitrary real numbers. Let 
$M_n = a_1^n + a_2^n + \cdots + a_m^n$ be their nth-power sum. 

(a) Let $E_0 = 1$ and $E_n = E_n(a_1, a_2, \ldots, a_m)$, $n \ge 1$, be the nth elementary 
symmetric function of the a's. Prove that e(x), the ordinary generating 
function for $\{E_n\}$, satisfies the identity $e(x) = \exp[\sum_{n \ge 1} (-1)^{n+1} (M_n/n) x^n]$. 

(b) Let $H_0 = 1$ and $H_n = H_n(a_1, a_2, \ldots, a_m)$, $n \ge 1$, be the nth homogeneous 
symmetric function of the a's, i.e., 

$$H_n = \sum M_\alpha(a_1, a_2, \ldots, a_m),$$

the sum, over all partitions $\alpha$ of n having at most m parts, of the minimal 
symmetric polynomial $M_\alpha$. Prove that h(x), the ordinary generating 
function for $\{H_n\}$, satisfies the identity $h(x) = \exp[\sum_{n \ge 1} (M_n/n) x^n]$. 

28 Prove Lemma 4.4.15. 

29 Starting with $b_0 = 1$, the Bernoulli numbers satisfy the implicit recurrence 
$\sum_{r=0}^{n} C(n+1, r)\, b_r = 0$, $n \ge 1$. Show that the exponential generating function 
for the Bernoulli numbers has the closed formula $g(x) = x/(e^x - 1)$. 

30 Say that the permutation $p \in S_n$ fluctuates if the integers in the sequence 
$p = (p(1), p(2), \ldots, p(n))$ alternately rise and fall, i.e., if $p(2k-1) < p(2k)$, 
$1 \le k \le \lfloor n/2 \rfloor$, and $p(2k+1) < p(2k)$, $1 \le k \le \lfloor (n-1)/2 \rfloor$. In sequence 
notation, the five "fluctuating" permutations in $S_4$ are (1,3,2,4), 
(1,4,2,3), (2,3,1,4), (2,4,1,3), and (3,4,1,2). Denote the number of 
fluctuating permutations in $S_n$ by $\xi_n$. Setting $\xi_0 = 1$, the first few terms of the 
sequence $\{\xi_n\}$ of Euler numbers are 

1,1,1,2,5,16,61,... 

(a) List the 16 fluctuating permutations in $S_5$. 

(b) The Euler numbers obey the recurrence $\xi_0 = \xi_1 = 1$ and 

$$2\xi_{n+1} = \sum_{r=0}^{n} C(n,r)\, \xi_r \xi_{n-r}, \quad n \ge 1.$$

Use this relation (along with $\xi_0, \ldots, \xi_6$ given above) to show that $\xi_7 = 272$. 

(c) Show that the exponential generating function for $\{\xi_n\}$ has the closed 
formula sec(x) + tan(x). (Hint: Use part (b) with the Maclaurin series 
expansions of secant and tangent.) 

31 (L. Lovasz) Let $s_0 = 1$ and $s_n = \sum_{r=1}^{n} r!\, S(n,r)$, $n \ge 1$. 

(a) Show that $s_n$, $n \ge 1$, is the number of functions 

$$f : \{1, 2, \ldots, n\} \to \{1, 2, \ldots, n\}$$

that are onto $\{1, 2, \ldots, r\}$ for some $r \in \{1, 2, \ldots, n\}$. 

(b) Show that $s_n = \sum_{r \ge 1} C(n,r)\, s_{n-r}$, $n \ge 1$. 

(c) Letting $g(x) = \sum_{n \ge 0} s_n x^n/n!$ be the exponential generating function for 
$\{s_n\}$, show that $g(x) = (2 - e^x)^{-1}$. 

(d) Show that $(2 - e^x)^{-1} = \frac{1}{2} \sum_{k \ge 0} (e^x/2)^k$. 

(e) Show that $(2 - e^x)^{-1} = \sum_{n \ge 0} \sum_{k \ge 0} (k^n/2^{k+1})\, x^n/n!$. 

(f) Show that $\sum_{n=1}^{m} n!\, S(m,n) = \sum_{r \ge 0} r^m/2^{r+1}$. 

(g) Show that $\sum_{n=1}^{3} n!\, S(3,n) = 13$. 

(h) Write a computer program based on the following algorithm and use it to 
approximate the right-hand side of the equation in part (f) when m = 3: 

1. M = 3. 

2. K = 100. 

3. S = 0. 

4. For R = 1 to K 

5. S = S + R^M / 2^(R+1). 

6. Next R. 

7. Return S. 
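
One possible rendering of the seven-step algorithm in part (h) is sketched below in Python. The function name `lovasz_partial_sum` and its default arguments are assumptions made for this illustration; floating-point arithmetic is used, so the result is only an approximation of the truncated series.

```python
def lovasz_partial_sum(m=3, k=100):
    """Steps 1-7 of the algorithm: add up R^M / 2^(R+1) for R = 1, ..., K."""
    s = 0.0
    for r in range(1, k + 1):
        s += r**m / 2**(r + 1)   # true division, so s is a float
    return s

# With M = 3 and K = 100 the printed value should be very close to 13,
# the value asserted in part (g).
print(lovasz_partial_sum())
```

Raising K beyond 100 changes the result only negligibly, because the omitted terms shrink geometrically.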
32 Let W(n) be the number of n-letter "words" that can be made from the 
alphabet A = {r, s, t} subject to the conditions that letter r can be repeated at 
most three times, s at most twice, and t at most once. 

(a) Show that W(5) = 60. 

(b) Show that the exponential generating function for $\{W(n)\}$ is 
$(1 + x/1! + x^2/2! + x^3/3!)(1 + x/1! + x^2/2!)(1 + x/1!)$. 

(c) Compare and contrast with Exercise 40, Section 4.3. 

33 Let c(n) be the number of n-letter words that can be made from the alphabet 
{N, D, Q} subject to the conditions that N can occur at most 10 times, D at 
most 5 times, and Q at most twice. Find a closed formula for the exponential 
generating function for {c(n)}. 



4.5. RECURSIVE TECHNIQUES 

Fibonacci numbers and the golden ratio are ubiquitous in nature. The number 
$(1 + \sqrt{5})/2$ seems an unlikely candidate for what is arguably the most important ratio 
in the natural world, yet it possesses a subtle power that drives the arrangements of 
leaves, seeds, and spirals in many plants from vastly different origins. 

— Michael Naylor (Mathematics Magazine) 

Encountered frequently in the exercises, the Fibonacci numbers are defined by 
$F_0 = 1$, $F_1 = 1$, and $F_n = F_{n-1} + F_{n-2}$, $n \ge 2$. The first few terms of the Fibonacci 
sequence are 

1,1,2,3,5,8,13,21,34,55,... 

It was the French number theorist Edouard Lucas who suggested naming these 
numbers after Leonardo of Pisa, also known as Fibonacci. Indeed, the sequence 

1,3,4,7,11,18,29,47,76,123,... 

defined by $L_0 = 1$, $L_1 = 3$, and $L_n = L_{n-1} + L_{n-2}$, $n \ge 2$, has come to be known as 
the Lucas sequence. 

The descriptions of these sequences have two elements. One consists of initial 
conditions that explicitly prescribe the first few terms. The second is a recurrence 
by means of which the remaining terms are determined inductively. Roughly 
speaking, a recurrence for $\{a_n\}$ is a formula for $a_n$ as a function of previous terms. The 
Fibonacci and Lucas sequences, e.g., obey the recurrence $a_n = a_{n-1} + a_{n-2}$, $n \ge 2$. 
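
The two ingredients of such a description, initial conditions and a recurrence, translate directly into a short program. The Python sketch below is an illustration only; the function name `recurrence_terms` and its parameters are assumptions introduced here, not notation from the text.

```python
def recurrence_terms(initial, coeffs, count):
    """Return the first `count` terms of the sequence with the given initial
    terms and recurrence a_n = coeffs[0]*a_{n-1} + ... + coeffs[k-1]*a_{n-k}.
    Assumes len(initial) >= len(coeffs)."""
    terms = list(initial)
    while len(terms) < count:
        # terms[-j] is a_{n-j}, the term j places before the new one.
        terms.append(sum(c * terms[-j] for j, c in enumerate(coeffs, start=1)))
    return terms[:count]

fibonacci = recurrence_terms([1, 1], [1, 1], 10)
lucas = recurrence_terms([1, 3], [1, 1], 10)
print(fibonacci)  # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
print(lucas)      # [1, 3, 4, 7, 11, 18, 29, 47, 76, 123]
```

The same recurrence with different initial conditions gives a different sequence, which is the whole difference between the Fibonacci and Lucas numbers.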

'Starting as early as Section 1.2. 
4.5.1 Example. The first few terms of the sequence $\{a_n\}$ defined by initial 
conditions $a_0 = 0$, $a_1 = 1$, and recurrence $a_n = a_{n-1} - a_{n-2}$, $n \ge 2$, are 

0,1,1,0,-1,-1,0,1,1,... 

If $\{b_n\}$ is the sequence defined by the same recurrence, $b_n = b_{n-1} - b_{n-2}$, $n \ge 2$, 
and the boundary conditions $b_1 = 1$ and $b_3 = 2$, then 

$$b_2 = b_1 - b_0 = 1 - b_0$$

and 

$$2 = b_3 = b_2 - b_1 = (1 - b_0) - 1 = -b_0.$$

So, $b_0 = -2$, $b_2 = 3$, and the first few terms of $\{b_n\}$ are 

-2,1,3,2,-1,-3,-2,1,3,... 

What about defining $\{b_n\}$ using the same recurrence, but with boundary conditions 
$b_0 = 0$ and $b_3 = 1$? In this case, $b_2 = b_1 - b_0 = b_1$ and $b_3 = b_2 - b_1 = 0 \ne 1$. In 
other words, there is no such sequence! □ 

Initial conditions are special kinds of boundary conditions that specify the first 
few consecutive terms of a sequence. For the remainder of this section, we will 
focus exclusively on sequences prescribed by initial conditions and a recurrence. 

Recall, from Equation (4.21) in Section 4.2, that a homogeneous linear 
recurrence with constant coefficients is a relation of the form 

$$a_n = c_1 a_{n-1} + c_2 a_{n-2} + \cdots + c_k a_{n-k}, \quad n \ge k, \tag{4.70}$$

where k is a fixed positive integer and $c_1, c_2, \ldots, c_k$ are constants. 

4.5.2 Example. As we saw in Theorem 2.2.7, the Bell numbers satisfy the 
recurrence $B_0 = 1$ and 

$$B_n = \sum_{r=0}^{n-1} C(n-1, r)\, B_r, \quad n \ge 1.$$

While it is homogeneous, this recurrence fails to be linear because the number of 
summands on the right-hand side is not constant. Moreover, because it depends 
on n, binomial coefficient $C(n-1, r)$ is not constant in the sense of Equation (4.70). □ 

4.5.3 Theorem. If $\{a_n\}$ satisfies the homogeneous linear recurrence 
$a_n = c_1 a_{n-1} + c_2 a_{n-2} + \cdots + c_k a_{n-k}$, $n \ge k$, with constant coefficients, then the 
(ordinary) generating function 

$$f(x) = \sum_{n \ge 0} a_n x^n$$

has the closed formula $f(x) = h(x)/q(x)$, where 

$$q(x) = 1 - c_1 x - c_2 x^2 - \cdots - c_k x^k$$

and h(x) is a polynomial of degree at most k - 1. 

Proof. It follows from the recurrence that, for all $n \ge k$, the coefficient of $x^n$ in 

$$q(x)f(x) = f(x) - c_1 x f(x) - c_2 x^2 f(x) - \cdots - c_k x^k f(x)$$

is $a_n - c_1 a_{n-1} - c_2 a_{n-2} - \cdots - c_k a_{n-k} = 0$. Hence $h(x) = q(x)f(x)$ is a polynomial 
of degree at most k - 1. ■ 
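
The cancellation in this proof can be watched numerically. The Python sketch below (the helper name `product_coefficients` is an assumption made here) multiplies $q(x) = 1 - x - x^2$ by a truncation of the generating function of the Fibonacci sequence 1, 1, 2, 3, ... and shows that every coefficient from $x^2$ onward vanishes.

```python
def product_coefficients(q, f, count):
    """Coefficients of x^0 .. x^(count-1) in q(x)*f(x), where q and f are
    lists of coefficients and f has at least `count` entries."""
    return [
        sum(q[i] * f[n - i] for i in range(min(n, len(q) - 1) + 1))
        for n in range(count)
    ]

fib = [1, 1]
while len(fib) < 12:
    fib.append(fib[-1] + fib[-2])

q = [1, -1, -1]                      # q(x) = 1 - x - x^2
h = product_coefficients(q, fib, 12)
print(h)                             # [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
```

In this instance h(x) = 1, a polynomial of degree 0, which is within the bound k - 1 = 1.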

In Section 4.2, partial fractions were used to convert 

$$f(x) = \frac{h(x)}{1 - c_1 x - c_2 x^2 - \cdots - c_k x^k} \tag{4.71}$$

into a form from which a solution (closed formula) for $a_n$ could easily be 
determined. That technique depended upon being able to factor 
$q(x) = 1 - c_1 x - c_2 x^2 - \cdots - c_k x^k$. 

4.5.4 Example. Consider the sequence 

1,6,24,84,276,... 

defined by $a_0 = 1$, $a_1 = 6$, and $a_n = 5a_{n-1} - 6a_{n-2}$, $n \ge 2$. Then $q(x) = 1 - 5x + 6x^2$, 
and it follows from Theorem 4.5.3 that the generating function for the sequence has 
the closed formula 

$$f(x) = \frac{h(x)}{1 - 5x + 6x^2} = \frac{h(x)}{(1 - 2x)(1 - 3x)}.$$

Because h(x) is a polynomial of degree at most 1, there exist constants s and t such 
that 

$$f(x) = \frac{s}{1 - 2x} + \frac{t}{1 - 3x} = s(1 + 2x + 2^2x^2 + 2^3x^3 + \cdots) + t(1 + 3x + 3^2x^2 + 3^3x^3 + \cdots),$$

so 

$$a_n = s(2^n) + t(3^n), \quad n \ge 0. \tag{4.72}$$

So far, the initial conditions have not been used, i.e., any sequence that satisfies the 
recurrence $a_n = 5a_{n-1} - 6a_{n-2}$, $n \ge 2$, is solved by Equation (4.72). Let's call it a 
general solution of the recurrence. 

Using the initial conditions $a_0 = 1$ and $a_1 = 6$, we see that s and t in 
Equation (4.72) satisfy the simultaneous equations $1 = s + t$ and $6 = 2s + 3t$, from 
which it follows that s = -3 and t = 4. Hence, the solution to this particular 
sequence is 

$$a_n = -3(2^n) + 4(3^n), \quad n \ge 0. \tag{4.73}$$

(Confirm that Equation (4.73) produces the correct fifth number of the sequence, 
namely, $a_4 = 276$.) □ 
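
The last step, pinning down s and t from the two initial conditions, is easy to mechanize. The following Python sketch assumes two distinct characteristic values r1 and r2; the function name `fit_two_roots` is hypothetical and chosen only for this illustration.

```python
from fractions import Fraction

def fit_two_roots(r1, r2, a0, a1):
    """Solve s + t = a0 and s*r1 + t*r2 = a1 for (s, t); assumes r1 != r2."""
    t = Fraction(a1 - a0 * r1, r2 - r1)
    s = a0 - t
    return s, t

s, t = fit_two_roots(2, 3, 1, 6)
closed = [int(s * 2**n + t * 3**n) for n in range(5)]
print(s, t)     # -3 4
print(closed)   # [1, 6, 24, 84, 276]
```

Exact rational arithmetic is used so that the coefficients come out as -3 and 4 rather than as rounded floating-point values.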

The numbers 2 and 3 in Equation (4.72) came from the factorization 
$q(x) = 1 - 5x + 6x^2 = (1 - 2x)(1 - 3x)$. They are the reciprocals of the roots of q(x). From a 
purely mechanical perspective, it seems more natural to work, not with q(x), but 
with the polynomial $u(x) = x^2 q(1/x) = x^2 - 5x + 6 = (x - 2)(x - 3)$, whose roots 
are 2 and 3. 

4.5.5 Definition. The characteristic polynomial afforded by the homogeneous 
linear recurrence $a_n = c_1 a_{n-1} + c_2 a_{n-2} + \cdots + c_k a_{n-k}$, $n \ge k$, is 

$$u(x) = x^k - c_1 x^{k-1} - c_2 x^{k-2} - \cdots - c_{k-1} x - c_k.$$

Beyond the "mechanical perspective", there is a better reason to introduce the 
characteristic polynomial. As the method of partial fractions shows, homogeneous 
linear recurrences of the form $a_n = c_1 a_{n-1} + c_2 a_{n-2} + \cdots + c_k a_{n-k}$ are solved by 
linear combinations of exponentials. (See, e.g., the general solution in Equation 
(4.72).) But, in order for $a_n = r^n$ to solve the recurrence, it is necessary that 

$$r^n = c_1 r^{n-1} + c_2 r^{n-2} + \cdots + c_k r^{n-k}.$$

Upon dividing by $r^{n-k}$ and rearranging terms, this identity becomes 

$$r^k - c_1 r^{k-1} - c_2 r^{k-2} - \cdots - c_k = 0,$$

i.e., for $a_n = r^n$ to solve the recurrence, r must be a root of u(x). 
4.5.6 Theorem. Let $\{a_n\}$ be a sequence determined by the initial values 
$a_0, a_1, \ldots, a_{k-1}$ and the homogeneous linear recurrence 
$a_n = c_1 a_{n-1} + c_2 a_{n-2} + \cdots + c_k a_{n-k}$, $n \ge k$. If the distinct roots $r_1, r_2, \ldots, r_s$ of the 
corresponding characteristic polynomial u(x) have multiplicities $m_1, m_2, \ldots, m_s$, 
respectively, then there exists a polynomial $p_i$ of degree at most $m_i - 1$, $1 \le i \le s$, such that 

$$a_n = p_1(n) r_1^n + p_2(n) r_2^n + \cdots + p_s(n) r_s^n, \quad n \ge 0. \tag{4.74}$$

Proof. From Theorem 4.5.3 and Definition 4.5.5, the generating function for $\{a_n\}$ 
is 

$$f(x) = \frac{h(x)}{(1 - r_1 x)^{m_1} (1 - r_2 x)^{m_2} \cdots (1 - r_s x)^{m_s}},$$

where the degree of h(x) is less than $m_1 + m_2 + \cdots + m_s = k$. It follows from the 
theory of partial fractions that f(x) can be written as a sum of expressions, each of 
the form 

$$\frac{b_1}{1 - rx} + \frac{b_2}{(1 - rx)^2} + \cdots + \frac{b_m}{(1 - rx)^m}, \tag{4.75}$$

where $r = r_i$ and $m = m_i$, $1 \le i \le s$. By the binomial theorem for negative 
exponents (see, e.g., Equation (4.29)), 

$$(1 - rx)^{-t} = \sum_{n \ge 0} C(n + t - 1, n)\, r^n x^n. \tag{4.76}$$

Because $C(n + t - 1, n) = C(n + t - 1, t - 1)$, it follows from Equation (4.76) that 
the coefficient of $x^n$ in Equation (4.75) is 

$$[b_1 C(n + 1 - 1, 0) + b_2 C(n + 2 - 1, 1) + \cdots + b_m C(n + m - 1, m - 1)]\, r^n.$$

It remains to observe that 

$$p(n) = b_1 C(n + 1 - 1, 0) + b_2 C(n + 2 - 1, 1) + \cdots + b_m C(n + m - 1, m - 1)$$

is a polynomial in n of degree (at most) m - 1. ■ 

4.5.7 Corollary. Let $\{a_n\}$ be a sequence determined by the initial values 
$a_0, a_1, \ldots, a_{k-1}$ and the homogeneous linear recurrence 
$a_n = c_1 a_{n-1} + c_2 a_{n-2} + \cdots + c_k a_{n-k}$, $n \ge k$. If the roots $r_1, r_2, \ldots, r_k$ of the 
corresponding characteristic polynomial all have multiplicity 1, then there exist 
constants $p_i$, $1 \le i \le k$, such that 

$$a_n = p_1 r_1^n + p_2 r_2^n + \cdots + p_k r_k^n, \quad n \ge 0.$$

Proof. A polynomial of degree 0 is a constant. ■ 

4.5.8 Example. Suppose $a_0 = 3$, $a_1 = 2$, $a_2 = 4$, and $a_n = 2a_{n-1} + a_{n-2} - 2a_{n-3}$, 
$n \ge 3$. Then the first few terms of $\{a_n\}$ (check them) are 

3,2,4,4,8,12,24,44,... 

From the characteristic polynomial $u(x) = x^3 - 2x^2 - x + 2 = (x + 1)(x - 1)(x - 2)$, 
we obtain the general solution $a_n = p_1(-1)^n + p_2 1^n + p_3 2^n$. Together with the initial 
conditions $a_0 = 3$, $a_1 = 2$, and $a_2 = 4$, this leads to the system of equations 

$$\begin{aligned} 3 &= p_1 + p_2 + p_3 \\ 2 &= -p_1 + p_2 + 2p_3 \\ 4 &= p_1 + p_2 + 4p_3, \end{aligned}$$

the solution to which is $p_1 = \frac{2}{3}$, $p_2 = 2$, and $p_3 = \frac{1}{3}$. Thus, 

$$a_n = \tfrac{2}{3}(-1)^n + 2 + \tfrac{1}{3}2^n = 2 + \tfrac{2}{3}[2^{n-1} + (-1)^n].$$

(Confirm that this formula yields $a_7 = 44$.) □ 
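
When the characteristic roots are distinct, finding the constants of Corollary 4.5.7 amounts to solving a square linear system. The sketch below does this exactly for the example above with a small Gauss-Jordan routine; the names `solve_linear_system` and `closed_form` are assumptions of this illustration, and the routine assumes the system has a unique solution.

```python
from fractions import Fraction

def solve_linear_system(matrix, rhs):
    """Solve a square system exactly by Gauss-Jordan elimination.
    Assumes the coefficient matrix is invertible."""
    n = len(rhs)
    rows = [[Fraction(x) for x in row] + [Fraction(b)]
            for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Bring a row with a nonzero entry in this column into pivot position.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        scale = rows[col][col]
        rows[col] = [x / scale for x in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return [row[n] for row in rows]

roots = [-1, 1, 2]            # roots of u(x) = (x + 1)(x - 1)(x - 2)
initial = [3, 2, 4]           # a_0, a_1, a_2
matrix = [[r**n for r in roots] for n in range(len(roots))]
p = solve_linear_system(matrix, initial)

def closed_form(n):
    return sum(c * r**n for c, r in zip(p, roots))

print([str(c) for c in p])                 # ['2/3', '2', '1/3']
print([int(closed_form(n)) for n in range(8)])
```

The second line of output should reproduce the terms 3, 2, 4, 4, 8, 12, 24, 44 listed in the example.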

4.5.9 Example. Consider the sequence $\{a_n\}$ defined by $a_0 = 11$, $a_1 = 6$, $a_2 = 18$, 
$a_3 = 104$, $a_4 = 346$, and 

$$a_n = 6a_{n-1} - 13a_{n-2} + 14a_{n-3} - 12a_{n-4} + 8a_{n-5}, \quad n \ge 5. \tag{4.77}$$

This time the characteristic polynomial is 

$$u(x) = x^5 - 6x^4 + 13x^3 - 14x^2 + 12x - 8 = (x - 2)^3 (x - i)(x + i). \tag{4.78}$$

(While it may not be easy to obtain the factorization in Equation (4.78), how hard 
can it be to check and see that it is correct?) It follows from Equation (4.78) and 
Theorem 4.5.6 that 

$$a_n = p(n)2^n + c\,i^n + d(-i)^n, \tag{4.79}$$

where $p(n) = rn^2 + sn + t$ is a polynomial of degree at most 2. From the initial 
conditions (successively substitute n = 0, 1, 2, 3, and 4 into Equation (4.79)), we obtain 
the following system of five equations in five unknowns: 

$$\begin{aligned} 11 &= t + c + d \\ 6 &= 2r + 2s + 2t + ic - id \\ 18 &= 16r + 8s + 4t - c - d \\ 104 &= 72r + 24s + 8t - ic + id \\ 346 &= 256r + 64s + 16t + c + d, \end{aligned}$$

the solution to which is r = s = t = 1 and c = d = 5. (Is it easier to solve the system 
on your own or to confirm that this solution is correct?) Thus, 

$$a_n = (n^2 + n + 1)2^n + 5i^n + 5(-i)^n. \tag{4.80}$$

(Before going on, check to see that Equations (4.77) and (4.80) yield the same value 
for $a_5$, namely, 992.) 

On reflection, we worked harder than necessary to obtain Equation (4.80). From 
the initial conditions and recurrence, it is clear (for this sequence at least) that $a_n$ is 
real, for all $n \ge 0$. Thus, the fact that c and d are equal (but not that their common 
value is 5) should have been obvious from Equation (4.79). Instead of solving five 
equations in five unknowns, the problem could have been reduced to solving four 
equations in four unknowns. □ 

A linear recurrence with constant coefficients is a relation of the form 

$$a_n = c_1 a_{n-1} + c_2 a_{n-2} + \cdots + c_k a_{n-k} + w(n), \quad n \ge k, \tag{4.81}$$

where k is a fixed positive integer, $c_1, c_2, \ldots, c_k$ are constants, and w(n) is some 
function of n. Thus, a linear recurrence is homogeneous if and only if w(n) = 0. 

4.5.10 Example. Consider a recurrence of the form $a_n = a_{n-1} + w(n)$, $n \ge 1$, 
where w(n) is a polynomial of degree r in n. Then w(n) is the nth number in the 
difference sequence $\Delta a_n = a_{n+1} - a_n$. By Theorems 4.1.8 and 4.1.10, there is a 
polynomial p(x), of degree at most r + 1, such that $a_n = p(n)$ for all $n \ge 0$. 

To take a specific example, suppose $a_0 = 3$ and $a_n = a_{n-1} + 2n^2 - n + 1$, $n \ge 1$. 
Then the difference array for $\{a_n\}$ is 

3, 5, 12, 28, 57, 103, ... 
2, 7, 16, 29, 46, ... 
5, 9, 13, 17, ... 
4, 4, 4, ... 

So, again from Theorem 4.1.10, 

$$\begin{aligned} a_n &= 3C(n,0) + 2C(n,1) + 5C(n,2) + 4C(n,3) \\ &= 3 + 2n + \tfrac{5}{2}(n^2 - n) + \tfrac{2}{3}(n^3 - 3n^2 + 2n) \\ &= \tfrac{2}{3}n^3 + \tfrac{1}{2}n^2 + \tfrac{5}{6}n + 3. \end{aligned} \tag{4.82}$$

(Before going on, check to see that Equation (4.82) produces the correct results for 
n = 1, 2, and 3.) □ 
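
The difference-array computation of this example can be reproduced in a few lines. The Python sketch below builds the array for the sequence with $a_0 = 3$ and $a_n = a_{n-1} + 2n^2 - n + 1$, then evaluates the binomial-coefficient expansion whose coefficients are the leading entries of the rows; the helper names `difference_array` and `newton_form` are assumptions of this illustration.

```python
from math import comb

def difference_array(terms):
    """Rows of successive differences, stopping once a row is all zeros."""
    rows = [list(terms)]
    while len(rows[-1]) > 1 and any(rows[-1]):
        prev = rows[-1]
        rows.append([b - a for a, b in zip(prev, prev[1:])])
    return rows

terms = [3]
for n in range(1, 6):
    terms.append(terms[-1] + 2 * n**2 - n + 1)

rows = difference_array(terms)
leading = [row[0] for row in rows]           # 3, 2, 5, 4, then a final 0

def newton_form(n):
    """a_n as the sum of the leading differences times C(n, j)."""
    return sum(d * comb(n, j) for j, d in enumerate(leading))

print(rows[:4])
print([newton_form(n) for n in range(6)])    # [3, 5, 12, 28, 57, 103]
```

Because the third differences are constant, the fourth row is zero and contributes nothing, which is why the expansion stops at C(n, 3).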

For more complicated linear recurrences, we turn to the so-called method of 
undetermined coefficients, a fancy name for guess and check. 

4.5.11 Example. Consider the sequence 

5,1,34,39,226,415,... (4.83) 

defined by $a_0 = 5$, $a_1 = 1$, and $a_n = a_{n-1} + 6a_{n-2} - 6n^2 + 26n - 25$, $n \ge 2$. If it were 
not for the term $6a_{n-2}$, we could use the method of Example 4.5.10, expecting 
the solution to be a polynomial of degree 3 in n. If $w(n) = -6n^2 + 26n - 25$ were 
zero, the characteristic polynomial $x^2 - x - 6 = (x - 3)(x + 2)$ would lead us to 
expect a solution of the form $s3^n + t(-2)^n$. The idea behind the method of 
undetermined coefficients is to look for a solution of the form 

$$a_n = s3^n + t(-2)^n + an^3 + bn^2 + cn + d. \tag{4.84}$$

This leads to the system of equations 

$$\begin{aligned} 5 &= a_0 = s + t + d \\ 1 &= a_1 = 3s - 2t + a + b + c + d \\ 34 &= a_2 = 9s + 4t + 8a + 4b + 2c + d \\ 39 &= a_3 = 27s - 8t + 27a + 9b + 3c + d \\ 226 &= a_4 = 81s + 16t + 64a + 16b + 4c + d \\ 415 &= a_5 = 243s - 32t + 125a + 25b + 5c + d \end{aligned}$$

whose solution is s = 2, t = 3, b = 1, and a = c = d = 0. So far, so good. We have 
shown that if the solution to Sequence (4.83) has the form given in Equation (4.84), 
then 

$$a_n = 2(3)^n + 3(-2)^n + n^2, \quad n \ge 0. \tag{4.85}$$

We know that the sequence defined by Equation (4.85) satisfies the initial 
conditions $a_0 = 5$ and $a_1 = 1$. (These initial conditions gave us the first two of our six 
equations.) If we can show that it also satisfies the recurrence 
$a_n = a_{n-1} + 6a_{n-2} - 6n^2 + 26n - 25$, $n \ge 2$, we will be finished. Let's check it out. 
From Equation (4.85), 

$$\begin{aligned} a_{n-1} &= 6(3)^{n-2} - 6(-2)^{n-2} + n^2 - 2n + 1, \quad n \ge 1, \\ 6a_{n-2} &= 12(3)^{n-2} + 18(-2)^{n-2} + 6n^2 - 24n + 24, \quad n \ge 2. \end{aligned}$$

Adding the sum of these two equations to $-6n^2 + 26n - 25$ gives 

$$18(3)^{n-2} + 12(-2)^{n-2} + n^2 = 2(3)^n + 3(-2)^n + n^2 = a_n.$$

Therefore, Equation (4.85) solves Sequence (4.83). □ 
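
The "check" half of guess and check can also be carried out by machine, at least for finitely many terms. The Python sketch below generates terms of Sequence (4.83) from its recurrence and compares them with Equation (4.85); agreement on a finite range is numerical evidence only and does not replace the algebraic verification given in the example.

```python
def closed_form(n):
    return 2 * 3**n + 3 * (-2)**n + n**2          # Equation (4.85)

terms = [5, 1]                                    # a_0 and a_1
for n in range(2, 12):
    terms.append(terms[n - 1] + 6 * terms[n - 2] - 6 * n**2 + 26 * n - 25)

print(terms[:6])                                  # [5, 1, 34, 39, 226, 415]
print(all(closed_form(n) == a for n, a in enumerate(terms)))   # True
```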
4.5.12 Example. Consider the sequence defined by $a_0 = 9$, $a_1 = 17$, $a_2 = 24$, and 

$$a_n = 4a_{n-1} - 5a_{n-2} + 2a_{n-3} + 6n - 20, \quad n \ge 3. \tag{4.86}$$

The characteristic polynomial of the homogeneous part is 

$$x^3 - 4x^2 + 5x - 2 = (x - 1)^2 (x - 2), \tag{4.87}$$

which suggests guessing a solution of the form 

$$a_n = r2^n + (sn + t)1^n + an^2 + bn + c.$$

Because $1^n = 1$, $n \ge 0$, we may as well combine sn + t with bn + c and guess a 
solution of the form 

$$a_n = r2^n + an^2 + bn + c. \tag{4.88}$$

After using the initial conditions and Equation (4.86) to compute $a_3 = 27$, we are 
led to the following system of four equations in four unknowns: 

$$\begin{aligned} r + c &= 9 \\ 2r + a + b + c &= 17 \\ 4r + 4a + 2b + c &= 24 \\ 8r + 9a + 3b + c &= 27, \end{aligned}$$

the solution to which is r = -3, a = 1, b = 10, and c = 12. (Check it.) So, if the 
sequence has a solution of the form given in Equation (4.88), that solution is 

$$a_n = -3(2^n) + n^2 + 10n + 12. \tag{4.89}$$

As confirmed by the first three of our four equations, the sequence defined by 
Equation (4.89) satisfies the right initial conditions. However, computations show 
that this sequence satisfies 

$$a_n - 4a_{n-1} + 5a_{n-2} - 2a_{n-3} = -2 \ne 6n - 20.$$

In other words, Sequence (4.89) fails to satisfy Recurrence (4.86), i.e., the (original) 
sequence is not solved by Equation (4.88). The correct solution turns out to be 

$$a_n = 3(2^n) - n^3 + n^2 + 5n + 6, \quad n \ge 0. \tag{4.90}$$

(Confirm that the first few terms of the sequence given by this formula are 
9,17,24,27,26,....) □ 
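
The failure of the first guess is easy to see numerically. In the Python sketch below, the helper `residual` (a name chosen for this illustration) evaluates $a_n - 4a_{n-1} + 5a_{n-2} - 2a_{n-3}$ for a candidate formula, so that it can be compared with the required value 6n - 20.

```python
def residual(a, n):
    """Value of a_n - 4a_{n-1} + 5a_{n-2} - 2a_{n-3} for the formula a."""
    return a(n) - 4 * a(n - 1) + 5 * a(n - 2) - 2 * a(n - 3)

def failed_guess(n):          # Equation (4.89)
    return -3 * 2**n + n**2 + 10 * n + 12

def correct_solution(n):      # Equation (4.90)
    return 3 * 2**n - n**3 + n**2 + 5 * n + 6

# Columns: n, residual of (4.89), residual of (4.90), target 6n - 20.
for n in range(3, 7):
    print(n, residual(failed_guess, n), residual(correct_solution, n), 6 * n - 20)
```

The second column stays at -2 for every n, while the third column tracks 6n - 20, which is the behavior described in the example.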

What went wrong in Example 4.5.12? In one sense, nothing! There is, after all, 
no a priori guarantee that guesses always check. In this particular case, a better 
guess would evidently have been $a_n = r2^n + an^3 + bn^2 + cn + d$, i.e., $r2^n$ plus a 
polynomial of degree three. Hold that thought. 

4.5.13 Example. Let $\{b_n\}$ be the sequence defined by $b_0 = 9$, $b_1 = 17$, $b_2 = 24$, 
$b_3 = 27$, $b_4 = 26$, and 

$$b_n = 6b_{n-1} - 14b_{n-2} + 16b_{n-3} - 9b_{n-4} + 2b_{n-5}, \quad n \ge 5. \tag{4.91}$$

The characteristic polynomial of this homogeneous linear recurrence is 

$$x^5 - 6x^4 + 14x^3 - 16x^2 + 9x - 2 = (x - 1)^4 (x - 2),$$

which, by Theorem 4.5.6, means a solution of the form 

$$b_n = r2^n + (an^3 + bn^2 + cn + d)1^n = r2^n + an^3 + bn^2 + cn + d. \tag{4.92}$$

Solving for the undetermined coefficients yields 

$$b_n = 3(2^n) - n^3 + n^2 + 5n + 6. \tag{4.93}$$

□ 

Despite the fact that Recurrences (4.86) and (4.91) are dramatically different, 
Equations (4.90) and (4.93) show that $\{a_n\} = \{b_n\}$, i.e., the sequences 
themselves are identical! This coincidence bears on why our guesses were successful 
in Examples 4.5.10 and 4.5.11 but not in Example 4.5.12. The difficulty can be 
traced to the multiplicity of x - 1 as a zero of the characteristic polynomial. 
The way to overcome this difficulty is to adjust our guesses, not by combining 
polynomial contributions as in Example 4.5.12, but by adding their degrees. 
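
That the nonhomogeneous Recurrence (4.86) and the homogeneous Recurrence (4.91) generate the same numbers can be spot-checked directly. The Python sketch below compares the first fifteen terms produced by each recurrence with the common closed formula; it is a finite check offered as an illustration, not a proof.

```python
a = [9, 17, 24]                       # Recurrence (4.86), nonhomogeneous
for n in range(3, 15):
    a.append(4 * a[n - 1] - 5 * a[n - 2] + 2 * a[n - 3] + 6 * n - 20)

b = [9, 17, 24, 27, 26]               # Recurrence (4.91), homogeneous
for n in range(5, 15):
    b.append(6 * b[n - 1] - 14 * b[n - 2] + 16 * b[n - 3]
             - 9 * b[n - 4] + 2 * b[n - 5])

closed = [3 * 2**n - n**3 + n**2 + 5 * n + 6 for n in range(15)]
print(a == b == closed)               # True
```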

4.5.14 Rule. Let $\{a_n\}$ be a sequence determined by the initial values 
$a_0, a_1, \ldots, a_{k-1}$ and linear recurrence $a_n = c_1 a_{n-1} + c_2 a_{n-2} + \cdots + c_k a_{n-k} + w(n)$, 
$n \ge k$, where w(n) is a polynomial in n of degree d. If the distinct roots $1 = r_0, r_1, r_2, \ldots, r_s$ 
of the corresponding characteristic polynomial 
$u(x) = x^k - c_1 x^{k-1} - c_2 x^{k-2} - \cdots - c_k$ have multiplicities $m_0, m_1, m_2, \ldots, m_s$, respectively, then 
there exist polynomials $p_i$ of degree at most $m_i - 1$, $1 \le i \le s$, such that 

$$a_n = p_1(n) r_1^n + p_2(n) r_2^n + \cdots + p_s(n) r_s^n + p(n), \quad n \ge 0,$$

where p(n) is a polynomial of degree at most $d + m_0$. 

Before going on to the next idea, check to see that the solutions in Examples 
4.5.10-4.5.13 are consistent with Rule 4.5.14. 

4.5.15 Example. Consider the sequence $\{a_n\}$ defined by $a_0 = 3$ and 
$a_n = 3a_{n-1} + 2(5^{n-1})$, $n \ge 1$. This time $w(n) = 2(5^{n-1})$ is not a polynomial in n. What do we 
do now? Why not try guess and check? The general solution to the homogeneous 
part, namely, $a_n = 3a_{n-1}$, is $a_n = c3^n$. Might the solution have the form 
$a_n = c3^n + b5^{n-1}$, $n \ge 0$? We might just as well take d = b/5 and look for a solution of the 
form 

$$a_n = c3^n + d5^n. \tag{4.94}$$

Because there are two unknowns, we should look for two equations. The initial 
condition $a_0 = 3$ gives one, and setting n = 1 in the recurrence yields 
$a_1 = 3a_0 + 2(5^0) = 11$. It follows from 

$$\begin{aligned} c + d &= 3 \\ 3c + 5d &= 11 \end{aligned}$$

that c = 2 and d = 1. 

Once again, if there is a solution of the form given in Equation (4.94), then it 
must be 

$$a_n = 2(3^n) + 5^n, \quad n \ge 0. \tag{4.95}$$

Let's check it out. First, the sequence defined by Equation (4.95) satisfies the initial 
condition $a_0 = 3$; after all, that is where the equation c + d = 3 came from. Thus, it 
remains to verify that Sequence (4.95) satisfies the recurrence $a_n = 3a_{n-1} + 2(5^{n-1})$, 
$n \ge 1$. But, $a_{n-1} = 2(3^{n-1}) + 5^{n-1}$ implies that 
$3a_{n-1} + 2(5^{n-1}) = 6(3^{n-1}) + 5(5^{n-1}) = 2(3^n) + 5^n = a_n$. □ 
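
Both halves of this guess-and-check argument fit in a few lines of Python. The sketch below generates terms from the recurrence, solves the two equations for c and d by elimination, and compares the resulting formula with the generated terms; the variable names are chosen for this illustration.

```python
terms = [3]                                   # a_0
for n in range(1, 12):
    terms.append(3 * terms[n - 1] + 2 * 5**(n - 1))

# Solve c + d = a_0 and 3c + 5d = a_1: substituting c = a_0 - d
# gives 2d = a_1 - 3*a_0.
d = (terms[1] - 3 * terms[0]) // 2
c = terms[0] - d
print(c, d)                                   # 2 1
print(all(c * 3**n + d * 5**n == a for n, a in enumerate(terms)))   # True
```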
4.5.16 Example. Consider the sequence 

5,17,57,189,621,... 

defined by $a_0 = 5$ and $a_n = 3a_{n-1} + 2(3^{n-1})$, $n \ge 1$. Following the approach of 
Example 4.5.15 would lead to a guess of the form $a_n = c3^n + d3^n$, $n \ge 0$, which 
can be expressed more simply as $a_n = b3^n$, $n \ge 0$. From the initial condition, 
$5 = a_0 = b3^0$, we see that b = 5. Thus, our guess becomes $a_n = 5(3^n)$, $n \ge 0$. Since 
$a_1 = 3 \times 5 + 2 \times 3^0 = 17 \ne 15 = 5 \times 3^1$, this guess fails to check out. The 
solution we seek is not of the form $a_n = b3^n$. 

As in the discussion leading to Rule 4.5.14, the difficulty arises from an overlap 
between w(n) and the general solution to the homogeneous part. Let's try to mimic 
Example 4.5.13 and design a sequence $\{b_n\}$ with initial conditions $b_0 = 5$, $b_1 = 17$, 
and a homogeneous recurrence with characteristic polynomial 
$u(x) = (x - 3)^2 = x^2 - 6x + 9$, i.e., $b_n = 6b_{n-1} - 9b_{n-2}$. Then, from Theorem 4.5.6, $b_n = (cn + d)3^n$, 
$n \ge 0$. Together with the initial conditions, this leads to the simultaneous equations 

$$\begin{aligned} d &= 5 \\ 3c + 3d &= 17, \end{aligned}$$

the solution to which is $c = \frac{2}{3}$ and d = 5, i.e., $b_n = 2n(3^{n-1}) + 5(3^n)$, $n \ge 0$. The 
confirmation that $\{a_n\} = \{b_n\}$ is left to the reader. □ 
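
A quick numerical comparison, sketched below in Python, supports the claim that the two sequences coincide. It checks only finitely many terms, so it is evidence rather than the requested confirmation. The closed formula is written as $(2n + 15)3^n/3$, which equals $2n(3^{n-1}) + 5(3^n)$ and keeps the arithmetic in integers.

```python
a = [5]                                   # a_n = 3a_{n-1} + 2(3^(n-1))
for n in range(1, 12):
    a.append(3 * a[n - 1] + 2 * 3**(n - 1))

b = [5, 17]                               # b_n = 6b_{n-1} - 9b_{n-2}
for n in range(2, 12):
    b.append(6 * b[n - 1] - 9 * b[n - 2])

# (2n + 15) * 3^n is always divisible by 3, so integer division is exact.
closed = [(2 * n + 15) * 3**n // 3 for n in range(12)]
print(a[:5])                              # [5, 17, 57, 189, 621]
print(a == b == closed)                   # True
```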



4.5. EXERCISES 

1 Consider the Lucas sequence $\{L_n\}$ defined on p. 320. 

(a) Compute $|L_n^2 - L_{n-1}L_{n+1}|$ for several values of n. 

(b) Make a conjecture about the sequence whose nth term is $|L_n^2 - L_{n-1}L_{n+1}|$, 
$n \ge 1$. 

(c) Does the Fibonacci sequence exhibit a similar property? 

(d) Can you prove your conjecture in part (b)? 

(e) Ratios of successive Fibonacci numbers were the subject of Exercise 6(d), 
Section 4.2. What can be said about the ratio of successive Lucas numbers, 
$L_{n+1}/L_n$, as n increases? 

2 Find a closed formula for $a_n$ if 

(a) $a_0 = 0$ and $a_n = a_{n-1} + n$, $n \ge 1$. 

(b) $a_0 = 0$ and $a_n = a_{n-1} + n^2$, $n \ge 1$. 

(c) $a_0 = 0$ and $a_n = a_{n-1} + n^3$, $n \ge 1$. 

(d) $a_0 = 0$ and $a_n = na_{n-1}$, $n \ge 1$. 

3 Find a closed formula for $a_n$ if 

(a) $a_0 = 0$, $a_1 = 1$, and $a_n = 5a_{n-1} - 6a_{n-2}$, $n \ge 2$. 

(b) $a_0 = 2$, $a_1 = 5$, and $a_n = 5a_{n-1} - 6a_{n-2}$, $n \ge 2$. 

(c) $a_0 = 2$, $a_1 = 9$, and $a_n = 5a_{n-1} - 6a_{n-2}$, $n \ge 2$. 

4 Find a closed formula for $a_n$ if 

(a) $a_0 = 1$, $a_1 = 2$, $a_2 = 6$, and $a_n = 6a_{n-1} - 11a_{n-2} + 6a_{n-3}$, $n \ge 3$. 

(b) $a_0 = 1$, $a_1 = 0$, $a_2 = 6$, and $a_n = 4a_{n-1} - a_{n-2} - 6a_{n-3}$, $n \ge 3$. 

(c) $a_0 = 3$, $a_1 = 4$, $a_2 = 14$, and $a_n = 4a_{n-1} - a_{n-2} - 6a_{n-3}$, $n \ge 3$. 

5 Find a closed formula for $a_n$ if 

(a) $a_0 = 3$, $a_1 = 9$, $a_2 = 16$, and $a_n = 4a_{n-1} - 5a_{n-2} + 2a_{n-3}$, $n \ge 3$. 

(b) $a_0 = 1$, $a_1 = 6$, $a_2 = 28$, and $a_n = 6a_{n-1} - 12a_{n-2} + 8a_{n-3}$, $n \ge 3$. 

(c) $a_0 = 1$, $a_1 = 8$, $a_2 = 36$, and $a_n = 6a_{n-1} - 12a_{n-2} + 8a_{n-3}$, $n \ge 3$. 

6 Confirm that $a_n = 3(2^n) - n^3 + n^2 + 5n + 6$, $n \ge 0$, solves the sequence in 
Example 4.5.12. 

7 Find a closed formula for $a_n$ if 

(a) $a_0 = 3$, and $a_n = a_{n-1} + 3n - 2$. 

(b) $a_0 = 1$, and $a_n = a_{n-1} + 2n - 3$. 

(c) $a_0 = 2$, and $a_n = a_{n-1} + n^2 + 1$. 

8 Find a closed formula for $a_n$ if 

(a) $a_0 = 3$, and $a_n = 3a_{n-1} + 4^n/2$. 

(b) $a_0 = 1$, and $a_n = 2a_{n-1} + 4^{n-1}$. 

(c) $a_0 = 2$, and $a_n = 3a_{n-1} - 4n + 3(2^n)$. 

9 Find a closed formula for $a_n$ if 

(a) $a_0 = 2$, $a_1 = 9$, and $a_n = 6a_{n-1} - 9a_{n-2}$. 

(b) $a_0 = 2$, and $a_n = 3a_{n-1} + 3^n$. 

10 Finish Example 4.5.16 by confirming that the sequence $\{a_n\}$ defined by $a_0 = 5$ 
and $a_n = 3a_{n-1} + 2(3^{n-1})$, $n \ge 1$, is solved by $a_n = 3^{n-1}(2n + 15)$, $n \ge 0$. 

11 Solve the sequence $\{a_n\}$ defined by 

(a) $a_0 = 6$, $a_1 = 4$, and $a_n = a_{n-1} + 6a_{n-2} + 2^n$, $n \ge 2$. 

(b) $a_0 = 4$, $a_1 = 7$, and $a_n = -a_{n-1} + 6a_{n-2} + 5(2^n)$, $n \ge 2$. 

(c) $a_0 = 4$, $a_1 = 7$, $a_2 = 37$, and $a_n = a_{n-1} + 8a_{n-2} - 12a_{n-3}$, $n \ge 3$. 

12 The Tower of Hanoi puzzle was introduced by Professor Claus in 1883.* It 
consists of three vertical rods of the same diameter and n circular disks of 
different diameters with holes punched from their centers so that they can be 
slipped over the rods. In their initial position, the disks are arranged on one of 
the rods in the shape of a tower. (See Fig. 4.5.1.) A move consists of removing 
the top disk from one rod and transferring it to the top position on another, 
subject to the condition that no disk can ever sit on top of a smaller one. The 
object of these moves is to transfer the entire tower from the initial rod to one 
of the other two rods, one move at a time. 

Figure 4.5.1. Tower of Hanoi. 

Denote by $T_n$ the minimum number of moves required to transfer an n-disk 
tower from one rod to another. 

(a) Prove that the sequence $\{T_n\}$ satisfies the conditions $T_0 = 0$ and 
$T_n = 2T_{n-1} + 1$, $n \ge 1$. 

(b) Find a closed formula for $T_n$. 

(c) If one disk is moved each second, 24 hours a day, seven days a week, 
without making any mistakes, approximately how many centuries will it 
take to transfer a 64-disk tower? (Hint: $2^{10} \approx 10^3$.) 

13 Find a closed formula for $L_n$, the nth Lucas number. 

14 Suppose the monks in some monastery undertake the task of tossing a gold 
coin, believing that the monastery will vanish into hyperspace the moment two 
successive tails are tossed. Let $P(n) = a_n/b_n$ be the probability that successive 
tails occur for the first time on the (n - 1)st and nth tosses of the coin. 

(a) Explain why we may take $b_n = 2^n$. 

(b) Explain why $a_0 = a_1 = 0$. 

(c) Prove that $a_{n+2} = F_n$, the nth Fibonacci number, $n \ge 0$. 

(d) If $f(x) = \sum_{n \ge 0} a_{n+2} x^n$, prove that $f(x) = 1/(1 - x - x^2)$. 

(e) Prove that $\sum_{n \ge 0} P(n) = 1$. (Hint:† Show that the sum is $\frac{1}{4} f(\frac{1}{2})$, where f(x) 
is the generating function from part (d).) What implications does this 
probabilistic result have for the monks? 

*In 1884, de Parville published a two-page paper in La Nature, revealing that Claus is the anagrammatic 
pen name of (Edouard) Lucas. According to de Parville, a group of Tibetan monks is presently working in 
a secret monastery to transfer a tower of 64 golden disks. As de Parville tells the tale, the world will end in 
a thunderclap the moment the monks finish their task. 
† S. Kennedy and M. Stafford, Math. Mag. 67 (1994), 380-382. 
Figure 4.5.2 

15 If two sequences satisfy the same linear recurrence, show that they differ by a 
solution to the corresponding homogeneous recurrence. 

16 Let $\{c_n\}$ be the sequence defined by $c_0 = 1$ and $c_{n+1} = \sum_{r=0}^{n} c_r c_{n-r}$, $n \ge 0$. 

(a) If $f(x) = \sum_{n \ge 0} c_n x^n$ is the generating function for $\{c_n\}$, prove that 
$xf(x)^2 = f(x) - 1$. 

(b) Deduce from part (a) that $f(x) = [1 - (1 - 4x)^{1/2}]/2x$. 

(c) Prove that $c_n = C(2n,n)/(n + 1)$, the nth Catalan number from Exercise 
13, Section 1.2. (Hint: Newton's binomial theorem. Compare and contrast 
with Exercise 16, Section 1.2.) 

17 Say that n lines in the plane are in general position if no two of them are 
parallel and no three of them are concurrent (incident with a single point). 
Apart from the lines themselves, n lines in general position partition the plane 
into some number $r_n$ of regions. It is clear, e.g., from Fig. 4.5.2, that $r_0 = 1$, 
$r_1 = 2$, and $r_2 = 4$. 

(a) Show that $r_3 = 7$. 

(b) Prove that the sequence $\{r_n\}$ satisfies the linear recurrence $r_n = r_{n-1} + n$, 
$n \ge 1$. 

(c) Find a closed formula for $r_n$. 

18 Prove the converse of Theorem 4.5.3. 

19 Consider a sequence $\{a_n\}$, with fixed but arbitrary initial conditions, 
$a_0, a_1, \ldots, a_{k-1}$, and homogeneous linear recurrence 
$a_n = c_1 a_{n-1} + c_2 a_{n-2} + \cdots + c_k a_{n-k}$, $n \ge k$. Let $v_j$ be the k x 1 column vector 
whose ith entry is $a_{j+i-1}$, i.e., 

$$v_j = (a_j, a_{j+1}, \ldots, a_{j+k-1})^t, \quad j \ge 0.$$

Finally, let M be the k x k companion matrix 

$$M = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ c_k & c_{k-1} & c_{k-2} & \cdots & c_1 \end{pmatrix}.$$

(a) Show that $Mv_j = v_{j+1}$, $j \ge 0$. 

(b) Show that the characteristic polynomial of M is 
$u(x) = x^k - c_1 x^{k-1} - c_2 x^{k-2} - \cdots - c_{k-1} x - c_k$, the characteristic polynomial of the sequence. 

(c) Suppose $a_n = r^n$, $0 \le n < k$, where r is a real root of u(x). Using parts (a) 
and (b), prove that $a_n = r^n$ for all $n \ge 0$. 
By convention there is color . . . but in reality there are atoms and space. 

— Democritus 

The material in Chapter 5 has been selected from those topics in graph theory that 
afford an opportunity to discuss a combinatorial technique, like the pigeonhole 
principle; that exhibit an important combinatorial application, such as using 
Ferrers diagrams to characterize graphic sequences; or that involve a particularly 
nice example of combinatorial enumeration, e.g., the theory of chromatic poly- 
nomials. 

Apart from the pigeonhole principle, Section 5.1 introduces graph isomorphism 
and illustrates the notion of an invariant using degree sequence and number of 
connected components as examples. The theme of edge colorings is used in 
Section 5.2, first to introduce the basic elements of Ramsey Theory and then to 
count nonisomorphic graphs. Readers who omitted Section 3.7 should either skip 
all of Section 5.2 or just the material beyond Theorem 5.2.5. 

Stirling numbers of the first kind are seen, in Section 5.3, to be coefficients in 
chromatic polynomials of complete graphs. The notion of a proper coloring leads to 
bipartite graphs and trees. 

In Section 5.4, counting things in planar graphs leads to Euler's formula relating 
numbers of vertices, edges, regions, and components. By using this discussion as a 
pretext to prove the five-color theorem, the text strays a bit from those topics in 
graph theory strictly related to combinatorial enumeration. Discipline is restored 
in Section 5.5, but only by choosing from the extensive theory of matchings just 
those topics related to the matching polynomial. 

Oriented graphs, Laplacian matrices, and the matrix-tree theorem are discussed 
in Section 5.6. The focus of the final section is on necessary and sufficient condi- 
tions for a partition of 2m to be the degree sequence of some graph, finishing with 
the connection between Laplacian matrices and threshold graphs, i.e., graphs whose 
degree sequences are maximal with respect to majorization. Techniques from 
elementary linear algebra are used extensively in Section 5.6.

Apart from some vocabulary on p. 348, Sections 5.2 and 5.4 are optional. Later
sections do not depend on either of them. Except for the second half of Section 5.2 
(where the cycle index polynomial of the pair group is used to count nonisomorphic 
graphs), Chapter 5 is independent of Chapter 3. Despite the fact that the words 
generating function are used twice (once each in Sections 5.2 and 5.7), Chapter 
5 is independent of Chapter 4. Finally, one might reasonably exit Chapter 5 either 
from the middle of Section 5.2 or at the end of any section. 

5.1. THE PIGEONHOLE PRINCIPLE 

Either through a sense of curiosity or to break up the tedium of a long flight, most 
air travel passengers eventually become acquainted with the contents of the seat 
pocket in front of them. Among the more interesting items to be found there is 
the airline's route map. On the typical map, a nonstop flight connecting cities u 
and v is illustrated by a line segment or arc joining the cities. Let's give a name 
to the number of segments/arcs that touch at u. Call it the degree of u. Would it 
surprise you to learn that, on any route map, there are always two cities that 
have exactly the same degree? This coincidence is a consequence of the following 
self-evident fact. 

5.1.1 Pigeonhole Principle. If $n$ pigeonholes are occupied by more than $n$
pigeons, then some pigeonhole contains more than one pigeon.

Let's see what the pigeonhole principle has to do with airline route maps. We 
may assume that a total of k > 1 cities are represented on the map. It may happen 
that some city appears on the map even though it is not served by the airline; any 
such city has degree 0. At the other extreme, it might happen that a city is connected
to every other city on the map. Any such hub will have degree $k - 1$. Notice,
however, that these two extreme cases cannot occur simultaneously. (If some city is
connected to every other city, then there can be no city of degree 0.) So, among the
$k$ cities on any given map, at most $k - 1$ degrees are possible. In particular, there are
always more cities (pigeons) than degrees (pigeonholes), so some two cities must
share a degree.
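The route-map claim can also be checked by brute force for small maps. The sketch below is only an illustration: it models a map on $k$ cities as a set of unordered pairs of the labels $0, \ldots, k-1$, and the function names are hypothetical, not taken from the text.

```python
from itertools import combinations

def all_edge_sets(k):
    """Yield the edge list of every graph on the vertex set {0, ..., k-1}."""
    pairs = list(combinations(range(k), 2))
    for mask in range(1 << len(pairs)):
        yield [pairs[i] for i in range(len(pairs)) if (mask >> i) & 1]

def degrees(k, edges):
    """Return the list of vertex degrees."""
    deg = [0] * k
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def has_repeated_degree(k, edges):
    """True when some two vertices share a degree."""
    return len(set(degrees(k, edges))) < k

def check_route_maps(k):
    """True when every graph on k vertices has two vertices of equal degree."""
    return all(has_repeated_degree(k, edges) for edges in all_edge_sets(k))

print([check_route_maps(k) for k in range(2, 6)])
```

The exhaustive check is feasible only for very small $k$; the pigeonhole argument is what covers every map.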

Airline route maps afford just one illustration of the mathematical abstraction 
called a graph. Roughly speaking, a graph is a set of points some pairs of which 
are joined by arcs. To give a precise mathematical definition, let $V$ be a set. Denote
the family of its two-element subsets by $V^{(2)}$. Then, for example,
$\{u,v,w\}^{(2)} = \{\{u,v\},\{u,w\},\{v,w\}\}$;
$\{1,2,3,4\}^{(2)} = \{\{1,2\},\{1,3\},\{1,4\},\{2,3\},\{2,4\},\{3,4\}\}$;
and $\{x,y\}^{(2)} = \{\{x,y\}\}$. If $o(V) = n$, then $o(V^{(2)}) = C(n,2)$.

5.1.2 Definition. A graph consists of two things, a nonempty finite set $V$ and a
(possibly empty) subset $E$ of $V^{(2)}$. If $G = (V, E)$ is a graph, the elements of $V$ are its
vertices and the elements of $E$ its edges. When more than one graph is under
consideration, it may be useful to write $V(G)$ and $E(G)$, respectively, for the sets
of vertices and edges. If $e = \{u, v\} \in E(G)$, the vertices $u$ and $v$ are said to be
adjacent (to each other) and incident with $e$. Two edges are adjacent if they are
both incident with the same vertex, i.e., if their set-theoretic intersection consists of
a single vertex.

5.1.3 Example. If $V = \{1,2,3,4\}$, then $V^{(2)}$ has 6 elements and $2^6$ subsets.
Hence, there are 64 different graphs with vertex set $V = \{1,2,3,4\}$.

It is common to draw pictures of graphs in which vertices are represented by
points, and points representing adjacent vertices are joined by segments (or arcs).
If $E = \{\{1,2\},\{1,3\},\{1,4\},\{2,3\},\{2,4\}\} = V^{(2)} \setminus \{\{3,4\}\}$, then each of the
four pictures in Fig. 5.1.1 illustrates $G_1 = (V,E)$. □
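The counts in this example are easy to reproduce. The following sketch represents $V^{(2)}$ as a list of `frozenset` pairs; the helper name `two_element_subsets` is an assumption made for illustration.

```python
from itertools import combinations
from math import comb

def two_element_subsets(V):
    """Return V^(2), the family of two-element subsets of V."""
    return [frozenset(pair) for pair in combinations(sorted(V), 2)]

V = {1, 2, 3, 4}
V2 = two_element_subsets(V)

# Edge set of the graph in the example: every pair except {3, 4}.
E = [e for e in V2 if e != frozenset({3, 4})]

print(len(V2), comb(len(V), 2))   # o(V^(2)) and C(4, 2)
print(2 ** len(V2))               # number of graphs with vertex set V
print(len(E))                     # number of edges of the example graph
```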

An airline route map consists of a graph superimposed on a geometric represen- 
tation of part of the Earth's surface. In such maps, the length of an arc is a rough 
indication of distance. This metric property makes a route map more than a graph. 
The length of an arc representing an edge of G has no graph-theoretic significance. 
An edge of a graph is a subset consisting of exactly two of its vertices. 

5.1.4 Example. Not only can one graph be illustrated by different pictures, but
one picture can represent different graphs! If $V_2 = \{a,b,c,d\}$ and
$E_2 = \{\{a,b\},\{a,c\},\{b,c\},\{b,d\},\{c,d\}\}$, then each picture in Fig. 5.1.1 (also) illustrates
$G_2 = (V_2,E_2)$. □

We are not so much interested in "different" graphs as in "nonisomorphic" 
graphs. 

5.1.5 Definition. Let $G_1 = (V_1,E_1)$ and $G_2 = (V_2,E_2)$ be graphs. Then $G_1$ is
isomorphic to $G_2$ if there is a one-to-one function $f$ from $V_1$ onto $V_2$ such that $u$
and $v$ are adjacent in $G_1$ if and only if $f(u)$ and $f(v)$ are adjacent in $G_2$, i.e.,
such that

$$\{u,v\} \in E_1 \text{ if and only if } \{f(u),f(v)\} \in E_2. \qquad (5.1)$$

The function $f$ is called an isomorphism from $G_1$ onto $G_2$.

5.1.6 Example. If $G_1$ and $G_2$ are the graphs in Examples 5.1.3 and 5.1.4, respectively,
then $G_1$ and $G_2$ are isomorphic. If $f : V_1 \to V_2$ is the function $(b,c,a,d)$,
i.e., if $f(1) = b$, $f(2) = c$, $f(3) = a$, and $f(4) = d$, then $f$ is one of four
isomorphisms from $G_1$ onto $G_2$. □
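Condition (5.1) can be checked mechanically for the two graphs of Examples 5.1.3 and 5.1.4. In this sketch a bijection is stored as a Python `dict`, and, because $f$ is a bijection, the condition is tested by comparing the image of $E_1$ with $E_2$. The helper name `is_isomorphism` is hypothetical.

```python
from itertools import permutations

V1 = [1, 2, 3, 4]
E1 = {frozenset(e) for e in [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4)]}
V2 = ["a", "b", "c", "d"]
E2 = {frozenset(e) for e in ["ab", "ac", "bc", "bd", "cd"]}

def is_isomorphism(f, E1, E2):
    """Condition (5.1) for a bijection f given as a dict."""
    return {frozenset(f[u] for u in e) for e in E1} == E2

# The function (b, c, a, d) from the example.
f = {1: "b", 2: "c", 3: "a", 4: "d"}

# Sift through all 4! bijections and keep those satisfying (5.1).
isomorphisms = [dict(zip(V1, image)) for image in permutations(V2)
                if is_isomorphism(dict(zip(V1, image)), E1, E2)]

print(is_isomorphism(f, E1, E2), len(isomorphisms))
```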

If $G_1$ and $G_2$ can be illustrated by the same picture, they are isomorphic, because
to each point of the picture there corresponds a unique vertex $v_1$ of $G_1$ and a unique
vertex $v_2$ of $G_2$. The function that sends $v_1$ to $v_2$ (for every point of the picture) is an
isomorphism. It is much more challenging to tell when graphs illustrated by
different pictures are isomorphic.

Figure 5.1.2. Two illustrations of the Petersen graph.

5.1.7 Example. Consider the so-called Petersen graph $G_1$, illustrated in
Fig. 5.1.2. It is isomorphic to the graph $G_2$, pictured in the same figure. The proof
that $G_1$ and $G_2$ are isomorphic is by the numbers. If $V(G_1) = \{1,2,\ldots,10\} =
V(G_2)$, then $f(i) = i$, $1 \le i \le 10$, is an isomorphism. (Check it out. Confirm that
$i$ and $j$ are adjacent in $G_1$ if and only if $i$ and $j$ are adjacent in $G_2$.) Such a pair
of labeled figures may be considered a proof of isomorphism. (Provided, of course,
that they check out.) □

One problem with picturing graphs by means of points and lines is that a line 
segment contains infinitely many geometric points, whereas an edge of a graph 
consists of just two vertices.

5.1.8 Example. Take another look at the illustration of graph G2 in Fig. 5.1.2. 
Note that, in the picture, edge {3,9} appears to cross edge {5,6}, yet these two 
edges have no vertex in common. □ 

It follows from the definition that isomorphic graphs have the same numbers of 
vertices and edges. Consequently, if $G_1$ and $G_2$ do not share these properties, they
cannot be isomorphic. Properties like these, that isomorphic graphs must share, are 
called invariants. 

If $G_1$ and $G_2$ have the same number $n$ of vertices, and the same number $m$ of
edges, then, in principle at least, the isomorphism problem involves sifting through
$n!$ functions, looking for one that satisfies Condition (5.1). If $n = 10$, as in
Example 5.1.7, this involves $10! = 3{,}628{,}800$ functions, more than 3.6 million! It is
one thing to verify, by the numbers, that some given function is an isomorphism. It is
something else entirely to identify an isomorphism among so many candidates! This troublesome
prospect helps motivate the search for invariants. The more invariants we have, the
better our chances of finding one for which $G_1$ and $G_2$ differ, giving a back-door
proof that the graphs are not isomorphic. One important invariant is the multiset of
vertex degrees, a useful discussion of which depends on a proper definition.
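The contrast between checking invariants and sifting through $n!$ functions can be sketched in a few lines. This is a naive search under stated assumptions, not an efficient isomorphism algorithm: it compares vertex count, edge count, and the sorted list of degrees first, and only then tries every bijection. The function names and the two test graphs (a pentagon and a pentagram on the labels 0 to 4) are illustrative choices.

```python
from itertools import permutations
from math import factorial

def degree_sequence(V, E):
    """Vertex degrees in nonincreasing order; E is a set of frozensets."""
    return sorted((sum(1 for e in E if v in e) for v in V), reverse=True)

def find_isomorphism(V1, E1, V2, E2):
    """Return a dict f satisfying Condition (5.1), or None if there is none."""
    E1 = {frozenset(e) for e in E1}
    E2 = {frozenset(e) for e in E2}
    # Cheap invariants first: vertex count, edge count, degree sequence.
    if len(V1) != len(V2) or len(E1) != len(E2):
        return None
    if degree_sequence(V1, E1) != degree_sequence(V2, E2):
        return None
    # Otherwise fall back on sifting through all n! bijections.
    for image in permutations(V2):
        f = dict(zip(V1, image))
        if {frozenset(f[u] for u in e) for e in E1} == E2:
            return f
    return None

V = [0, 1, 2, 3, 4]
pentagon = [(i, (i + 1) % 5) for i in range(5)]
pentagram = [(i, (i + 2) % 5) for i in range(5)]
path = [(0, 1), (1, 2), (2, 3), (3, 4)]

print(factorial(10))                                     # size of the search when n = 10
print(find_isomorphism(V, pentagon, V, pentagram) is not None)
print(find_isomorphism(V, pentagon, V, path))            # rejected by the edge count
```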

5.1.9 Definition. Let $G = (V, E)$ be a graph. Suppose $v \in V$. The degree of $v$,
denoted $d_G(v)$, is the number of edges of $G$ that are incident with $v$, i.e., $d_G(v)$ is
the number of vertices of $G$ that are adjacent to $v$.

When its meaning is clear, we will typically write $d(v)$ in place of $d_G(v)$. Given a
graph on $n$ vertices, it is convenient to arrange the vertex degrees $d(v_1),
d(v_2), \ldots, d(v_n)$ in a sequence. Define

$$d(G) = (d_1, d_2, \ldots, d_n),$$

where $d_1 \ge d_2 \ge \cdots \ge d_n$ are the degrees of the vertices of $G$ arranged in
nonincreasing order. (It need not be the case that $d_i = d(v_i)$.)

5.1.10 Theorem. The degree sequence $d(G)$ is an invariant.

Proof. Let $f : V_1 \to V_2$ be an isomorphism from $G_1 = (V_1,E_1)$ onto $G_2 =
(V_2,E_2)$. Since $f$ is one-to-one, it suffices to show that $d(f(v)) = d(v)$ for all
$v \in V_1$. Because $\{u,v\} \in E_1$ if and only if $\{f(u),f(v)\} \in E_2$,

$$d(v) = o(\{u \in V_1 : \{u,v\} \in E_1\}) = o(\{f(u) \in V_2 : \{f(u),f(v)\} \in E_2\}) = d(f(v)).$$ ■

From $d(G)$, we can determine both $n$, the number of vertices of $G$, and $m$, the
number of its edges: $n$ is just the length of the sequence $d(G)$, and $m$ is given by the
so-called first theorem of graph theory:

5.1.11 Theorem. Let $G = (V,E)$ be a graph with vertex set $V = \{v_1, v_2, \ldots,
v_n\}$. If $o(E) = m$, then

$$\sum_{i=1}^{n} d(v_i) = 2m.$$

Proof. By definition, d(v) is the number of edges incident with vertex v. Thus, in 
summing d(v), each edge is counted twice, once at each of its vertices. ■ 
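As a quick numerical illustration of Theorem 5.1.11, the sketch below builds the Petersen graph and compares the degree sum with twice the number of edges. It relies on an assumption not made in the text: the Petersen graph is constructed in its standard form, with the two-element subsets of $\{0,\ldots,4\}$ as vertices and disjoint subsets adjacent, so the labels differ from those in Fig. 5.1.2.

```python
from itertools import combinations

# Assumed construction: vertices are the 2-element subsets of {0,...,4},
# and two vertices are adjacent exactly when the subsets are disjoint.
vertices = [frozenset(p) for p in combinations(range(5), 2)]
edges = [(a, b) for a, b in combinations(vertices, 2) if not a & b]

# Count each edge once at each of its two endpoints.
degree = {v: 0 for v in vertices}
for a, b in edges:
    degree[a] += 1
    degree[b] += 1

print(len(vertices), len(edges), sum(degree.values()))
```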

It is not uncommon in medieval literature for some character to be involved in a 
quest. If graph theorists had a quest, it would most likely be a short list of easily
computed invariants, sufficient to distinguish nonisomorphic graphs. For the
moment let's observe that, by itself, $d(G)$ can fail to distinguish nonisomorphic
graphs.

5.1.12 Example. The nonisomorphic graphs $G_1$ and $G_2$ of Fig. 5.1.3 share the
degree sequence $(2,2,2,1,1)$. □

5.1.13 Definition. Let $G = (V, E)$ be a graph. Suppose $u, w \in V$. A path in $G$ of
length $r$, from $u$ to $w$, is a sequence of distinct vertices $[v_0, v_1, \ldots, v_r]$ such that
$v_0 = u$, $v_r = w$, and $\{v_{i-1}, v_i\} \in E$, $1 \le i \le r$. Vertices $u$ and $w$ are in the same
component of $G$ if $u = w$ or if $u \ne w$ and there is a path in $G$ from $u$ to $w$. A graph
with just one component is said to be connected.

5.1.14 Example. In Fig. 5.1.3, $G_2$ is connected but $G_1$ is not. A little care should
be taken with this notion. If $G_3$ is the graph illustrated in Fig. 5.1.4, then $G_3$ is not
connected. In fact, $G_3$ is isomorphic to $G_1$. □

Figure 5.1.4 (the graph $G_3$)

Discussions of intractability frequently involve the class NP of decision problems that can be solved in
polynomial time by a "nondeterministic" computer (a hypothetical device able to work on an unbounded
number of independent computational sequences in parallel). In 1971, S. A. Cook proved that every
problem in NP can be reduced to the "satisfiability" problem, making it the first NP-complete problem. As
of this writing, whether the graph isomorphism problem is NP-complete remains an open question. Among
the best introductions to NP-completeness is (still) M. Garey and D. Johnson, Computers and Intractability:
A Guide to the Theory of NP-Completeness, W. H. Freeman, San Francisco, 1979.

5.1.15 Theorem. Isomorphic graphs have the same number of components.

Proof. Let $f : V(G_1) \to V(G_2)$ be an isomorphism from $G_1$ onto $G_2$. Then $[v_0,
v_1, \ldots, v_r]$ is a path in $G_1$ if and only if $[f(v_0), f(v_1), \ldots, f(v_r)]$ is a path in $G_2$.
Thus, $f$ maps the vertices of a component of $G_1$ onto the vertices of a component
of $G_2$. ■
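Both invariants discussed so far, the degree sequence and the number of components, are simple to compute. The sketch below applies them to the only two graphs that have degree sequence $(2,2,2,1,1)$: a triangle together with a separate edge, and a path on five vertices. That these are the graphs drawn in Fig. 5.1.3 is an assumption (the figure is not reproduced here), and the function names are illustrative.

```python
def degree_sequence(n, edges):
    """Degrees of the vertices 0..n-1 in nonincreasing order."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return sorted(deg, reverse=True)

def count_components(n, edges):
    """Count connected components with a depth-first search."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, components = set(), 0
    for start in range(n):
        if start in seen:
            continue
        components += 1
        seen.add(start)
        stack = [start]
        while stack:
            u = stack.pop()
            for w in adj[u] - seen:
                seen.add(w)
                stack.append(w)
    return components

triangle_plus_edge = [(0, 1), (1, 2), (0, 2), (3, 4)]
path = [(0, 1), (1, 2), (2, 3), (3, 4)]

for edges in (triangle_plus_edge, path):
    print(degree_sequence(5, edges), count_components(5, edges))
```

The degree sequences agree while the component counts differ, so by Theorem 5.1.15 the two graphs cannot be isomorphic.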



Suppose $V = \{v_1, v_2, \ldots, v_n\}$ and $W = \{w_1, w_2, \ldots, w_n\}$. Let $S$ be the set of all
graphs with vertex set $V$ and $T$ the set of all graphs with vertex set $W$. Define a
function $h : S \to T$ by $h((V, E)) = (W, F)$, where $F = \{\{w_i, w_j\} : \{v_i, v_j\} \in E\}$.
Then $h$ affords a one-to-one correspondence between $S$ and $T$ in which corresponding
graphs are isomorphic. Thus, as far as the mathematics of graph theory goes, the
nature of the vertices is immaterial. It doesn't matter whether they are cities on an 
airline route map, carbon atoms in a chemical molecule, or microprocessors in a 
parallel computer. In particular, it makes sense to talk about "the nonisomorphic 
graphs on n vertices". It doesn't matter which n vertices, just that there are n of 
them. 

5.1.16 Example. There are 11 nonisomorphic graphs among the $2^{C(4,2)} = 64$
different graphs on four vertices. They are illustrated in Fig. 5.1.5. □
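The count in Example 5.1.16 can be reproduced by grouping the 64 graphs into isomorphism classes. The approach sketched here is a brute-force one chosen for illustration, not the method of the text: each graph is replaced by the lexicographically smallest edge list obtainable by relabeling its vertices, and the distinct results are counted.

```python
from itertools import combinations, permutations

def canonical_form(n, edges):
    """Smallest relabeled edge list over all n! relabelings of the vertices."""
    best = None
    for p in permutations(range(n)):
        relabeled = tuple(sorted(tuple(sorted((p[u], p[v]))) for u, v in edges))
        if best is None or relabeled < best:
            best = relabeled
    return best

def count_nonisomorphic(n):
    """Number of isomorphism classes among the graphs on vertices 0..n-1."""
    pairs = list(combinations(range(n), 2))
    classes = set()
    for mask in range(1 << len(pairs)):
        edges = [pairs[i] for i in range(len(pairs)) if (mask >> i) & 1]
        classes.add(canonical_form(n, edges))
    return len(classes)

print(count_nonisomorphic(4))
```

Two graphs receive the same canonical form exactly when some relabeling carries one edge set onto the other, which is Condition (5.1).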

A useful short-cut when making lists of nonisomorphic graphs involves the 
notion of a "complement". 

5.1.17 Definition. The complement of $G = (V,E)$ is the graph $G^c =
(V, V^{(2)} \setminus E)$. So, $G$ and $G^c$ share the same vertex set, but $\{u, v\}$ is an edge of $G$
if and only if it is not an edge of $G^c$.

With one exception, the graphs in Fig. 5.1.5 are illustrated in complementary
pairs. This is possible because $G$ and $H$ are isomorphic if and only if $G^c$ and $H^c$
are isomorphic.

5.1.18 Example. The graphs illustrated in Fig. 5.1.6 are both complementary 
and isomorphic. □ 
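A small sketch of Definition 5.1.17 follows. It uses the path on four vertices as a test case, a standard example of a graph isomorphic to its complement; whether this is the graph drawn in Fig. 5.1.6 is not assumed. The helper names are illustrative.

```python
from itertools import combinations, permutations

def complement(n, edges):
    """Edge set of G^c: the pairs in V^(2) that are not edges of G."""
    E = {frozenset(e) for e in edges}
    return {frozenset(p) for p in combinations(range(n), 2)} - E

def isomorphic(n, E1, E2):
    """Brute-force test of Condition (5.1) over all bijections of 0..n-1."""
    E1 = {frozenset(e) for e in E1}
    E2 = {frozenset(e) for e in E2}
    return any({frozenset(p[u] for u in e) for e in E1} == E2
               for p in permutations(range(n)))

path = [(0, 1), (1, 2), (2, 3)]
complete = list(combinations(range(4), 2))

print(sorted(tuple(sorted(e)) for e in complement(4, path)))
print(isomorphic(4, path, complement(4, path)))
print(complement(4, complete))    # the complement of K_4 has no edges
```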



Figure 5.1.6

If $V = \{v_1, v_2, \ldots, v_n\}$ then $K_n = (V, V^{(2)})$, the graph with all $C(n, 2)$ possible
edges, is the complete graph on $n$ vertices. Its complement is the graph with $n$
vertices but no edges at all. Thus, $K_n^c$ is the graph having $n$ components each
consisting of a single isolated vertex.



5.1. EXERCISES 

1 Suppose n is a positive integer. If S is a set of n integers, show that some subset 
of S sums to a multiple of n. 

2 Suppose 100 balls are distributed among 15 urns. Prove that some two urns 
contain the same number of balls. 

3 If both $x$ and $y$ are integers, the point $P = (x, y)$ is a lattice point of the plane.
Suppose $P_i$, $1 \le i \le 5$, are five (different) lattice points. Each of the $C(5, 2) =
10$ pairs of these points determines a unique line segment. Show that the
midpoint of (at least) one of these segments is a lattice point.

4 Suppose $k$ pigeonholes are occupied by $r$ pigeons. Show that some pigeonhole
contains at least $\lceil r/k \rceil$ pigeons, where $\lceil x \rceil$ is the smallest integer not less
than $x$.

5 Consider $n$ objects each of which weighs a (positive) integer number of grams.
Suppose, taken all together, the objects weigh a total of $2n$ grams. If the
objects do not all weigh the same, and none of them weighs more than $n$
grams, prove that they can be partitioned into two piles of equal weight.

6 Prove that, in any group of 40 people, some 4 of them have birthdays in the
same month.

7 (P. Erdős) Let $S$ be an $(n + 1)$-element subset of $\{1, 2, \ldots, 2n\}$. Prove that
there exist two (different) integers in $S$, one of which exactly divides the other.

8 Consider an equilateral triangle 2 units on a side. Prove that it is not possible to
place five points in the interior of the triangle so that each of them is more than
1 unit away from all the others.

9 Consider the graphs $G_1$ and $G_2$ in Examples 5.1.3 and 5.1.4.

(a) Prove that the function $f$ described in Example 5.1.6 is an isomorphism.

(b) Explicitly describe the other three isomorphisms from $G_1$ onto $G_2$.

10 Find all pairs of isomorphic graphs from the following:

[Figure: graphs labeled (i) through (v).]

11 Prove that two graphs on three vertices are isomorphic if and only if they have 
the same numbers of vertices and edges. 

12 Use Example 5.1.16 to show that two graphs on four vertices are isomorphic if 
and only if they have the same degree sequence. 

13 Let $G = (V,E)$ be a graph having $k$ odd vertices, i.e., $k = o(\{v \in V : d(v)
\text{ is odd}\})$. Prove that $k$ is even.

14 Let $V = \{1, 2, 3, 4\}$. Find a set $E$ so that $G = (V, E)$ is illustrated by the picture

[Figure: three pictures, labeled (a), (b), and (c).]

15 Among those graphs pictured in Fig. 5.1.5, find one isomorphic to

[Figure: two graphs, labeled (a) and (b).]

16 Illustrate the complement of

[Figure: four graphs, labeled (a) through (d).]

17 Evidently (Example 5.1.18) it is possible for a graph to be isomorphic to its
complement.

(a) Find a graph on five vertices that is isomorphic to its complement.

(b) Prove that no graph on six vertices is isomorphic to its complement.

(c) Prove that no graph on seven vertices is isomorphic to its complement.

18 Prove that 

(a) $(G^c)^c = G$.

(b) $G^c$ is isomorphic to $H^c$ if and only if $G$ is isomorphic to $H$.

19 Let $G = (V, E)$ be a graph with $n$ vertices. For each $v \in V$, let $d^c(v)$ be the
degree of $v$ in the graph $G^c$. Explain why $d(v) + d^c(v) = n - 1$.

20 A graph with more than one component is said to be disconnected. 

(a) How many of the graphs in Fig. 5.1.5 are disconnected? 

(b) Show that the complement of a disconnected graph is connected. 

(c) Which graph(s) $G$ on four vertices have the property that both $G$ and $G^c$
are connected?

(d) Illustrate a connected graph G whose complement is connected, but not 
isomorphic to G. 

(e) Illustrate two nonisomorphic, connected graphs that have the same degree 
sequence. 

21 Prove that the relation "isomorphic to" is an equivalence relation on the set of 
graphs. 

22 Two of the graphs in Fig. 5.1.5 are drawn in such a way that edges appear to 
cross, yet their would-be intersection is not a vertex of the graph. Redraw these 
two graphs in such a way that the segments (arcs) representing edges do not 
cross, i.e., do not meet except at vertices. 

23 Explain why no graph can have degree sequence 
(a) (3,3,3,3,3). (b) (5,4,2,2, 1). 

24 Let G be the graph illustrated in Fig. 5.1.7. Prove, by the numbers, that G is 
isomorphic to the graph whose vertices and edges are the 8 vertices and 12 
edges of a cube, respectively. 




Figure 5.1.7

25 Illustrate six nonisomorphic graphs, each having five vertices and five edges.

26 Prove, "by the numbers", that the graph illustrated in Fig. 5.1.8 is isomorphic 
to the Petersen graph (shown twice in Fig. 5.1.2). 




Figure 5.1.8 

27 A multigraph consists of two things, a nonempty set $V$, and a multiset $E$
satisfying the property that every element of $E$ is an element of $V^{(2)}$. So, a
multigraph is like a graph except that more than one edge can be incident to
the same pair of vertices.

(a) Illustrate the multigraph M = (V,E), where V= {1,2,3,4} and E = 
{{1,2},{1,2},{1,4},{2,3},{2,3},{2,4},{3,4}}. 

(b) Show that Theorem 5.1.11 is valid for multigraphs. 

(c) Define "isomorphism" for multigraphs. 

28 Let $G = (V,E)$ and $H = (W,F)$ be graphs. Then $H$ is a subgraph of $G$ if
$W \subseteq V$ and $F \subseteq E$. Illustrate the seven nonisomorphic subgraphs of the
complete graph $K_3$.

29 A planar graph is one that can be drawn in the plane in such a way that 
segments (arcs) representing edges do not meet except at vertices. 

(a) Show that $K_3$ and $K_4$ are planar.

(b) Is $K_5$ planar?

(c) Show that the graph illustrated in Fig. 5.1.7 is planar. 



*5.2. EDGE COLORINGS AND RAMSEY THEORY 



Minds are like parachutes. They only function when they are open. 

— James Dewar 

Let's say that two people are acquainted if they have met before (whether they 
remember it or not). Strangers are people who are not acquainted. Would it surprise 
you to learn that, among the first six guests to arrive at a random Hollywood
cocktail party, there will always be three mutual acquaintances or three mutual
strangers? To see why this is so, suppose Alan is the first guest to arrive. The
next five guests fall into one of two categories according to whether they are
acquainted with Alan or not, and one of these categories (pigeonholes) must contain
(at least) three people.

Figure 5.2.1. The complete graph $K_6$.

Suppose Bev, Connie, and Donna are all acquaintances of Alan. If the ladies are 
mutual strangers, we are finished. Otherwise, some two of them are acquainted. 
These two, together with Alan, comprise three mutual acquaintances. It may be, 
on the other hand, that none of the three ladies are acquainted with Alan. If they 
happen to be mutually acquainted, we are finished. Otherwise, some two of them 
are strangers and these two, together with Alan, comprise three mutual strangers. 

Let's transcribe our observation to graphs. Identify the six guests with the
vertices of $K_6$ (Fig. 5.2.1). If guests $X$ and $Y$ are acquainted, color edge $\{X, Y\}$
black. Otherwise, color it white. Our conclusion is that the resulting figure contains
a black triangle, a white triangle, or both.

What about $n$ guests? Imagine a picture of $K_n = (V, V^{(2)})$ drawn using a black
pen. Select a (possibly empty) subset $E \subseteq V^{(2)}$ and white-out the edges of $K_n$ that
do not belong to $E$. The resulting black-white edge coloring of $K_n$ could easily be
mistaken for an illustration of the graph $G = (V,E)$. These two ways of looking at
the same picture reveal a natural one-to-one correspondence between the $2^{C(n,2)}$
different black-white colorings of the edges of $K_n$ and the $2^{C(n,2)}$ different graphs on $n$
vertices. Exploiting this correspondence requires some new definitions.

5.2.1 Definition. Let $G = (V,E)$ and $H = (W,F)$ be graphs. If $W \subseteq V$ and
$F \subseteq E$, then $H$ is a subgraph of $G$. If $F = E \cap W^{(2)}$, then $H$ is the subgraph of $G$
induced by $W$, written $H = G[W]$.

If $H = (W, F)$ is a subgraph of $G = (V, E)$ then, because $H$ is a graph, $F \subseteq W^{(2)}$.
Therefore, $F \subseteq E \cap W^{(2)}$, with equality if and only if $H = G[W]$. It follows that
$H = (W,F)$ is a subgraph of $G$ if and only if $H$ is a subgraph of $G[W]$, where
$W = V(H)$. In particular, $G[W] = (W, E \cap W^{(2)})$ is the unique maximal subgraph
of $G$ with vertex set $W$.

A clique is a nonempty set of mutually adjacent vertices. So, a nonempty subset
$W$ of $V(G)$ is a clique if and only if $W^{(2)} \subseteq E(G)$, if and only if $G[W] = (W, W^{(2)})$
is a complete graph. An independent set is a nonempty set of mutually nonadjacent
vertices. So, a nonempty subset $W$ of $V(G)$ is an independent set if and only if no
two of its vertices comprise an edge of $G$, if and only if $G[W] = (W, \emptyset)$, if and
only if $W$ is a clique in $G^c$.

Figure 5.2.2

Consider some fixed black-white edge coloring of $K_6$. Let $G$ be the six-vertex
subgraph whose edge set consists of the black-colored edges. Then a black triangle
in $K_6$ is a clique in $G$, and a white triangle in $K_6$ is an independent set in $G$. This
identification yields another way to state our party guest observation: Any graph $G$
with six vertices contains a three-vertex clique or a three-vertex independent set,
i.e., $G$ contains an induced subgraph isomorphic to $K_3$ or one isomorphic to $K_3^c$.

It is a consequence of Theorem 5.2.3 (below) that, for any positive integers $s$ and
$t$, there exists an integer $N$ such that every graph on $N$ vertices contains an induced
subgraph isomorphic to $K_s$ or one isomorphic to $K_t^c$. If $G$ is a graph on $n \ge N$
vertices, then $G$ has a total of $C(n,N)$ induced subgraphs each having $N$ vertices.
If $H$ is one of them then, since $H$ has an induced subgraph isomorphic to $K_s$ or to
$K_t^c$, so does $G$. We are led to the following:

5.2.2 Definition. Let $s$ and $t$ be positive integers. The Ramsey number $N(s, t)$ is
the smallest value of $n$ such that every graph on $n$ vertices contains an induced
subgraph isomorphic to $K_s$ or an induced subgraph isomorphic to $K_t^c$.

Our cocktail party discussion proves that $N(3,3) \le 6$. Because the pentagon
graph (illustrated in Fig. 5.2.2) contains neither $K_3$ nor $K_3^c$ as an induced subgraph,
$N(3, 3)$ is not less than 6. Therefore, $N(3, 3) = 6$.
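Both halves of the argument for $N(3,3) = 6$ are small enough to confirm exhaustively. In this sketch a black-white coloring of $K_n$ is represented by its set of black edges; the function names are illustrative, and the pentagon is taken to be the 5-cycle on the labels 0 to 4.

```python
from itertools import combinations

def has_mono_triangle(n, black):
    """True if some 3 vertices span only black edges or only white edges."""
    for trio in combinations(range(n), 3):
        colors = {frozenset(e) in black for e in combinations(trio, 2)}
        if len(colors) == 1:
            return True
    return False

def every_coloring_has_mono_triangle(n):
    """Check all 2^C(n,2) black-white edge colorings of K_n."""
    pairs = [frozenset(p) for p in combinations(range(n), 2)]
    for mask in range(1 << len(pairs)):
        black = {pairs[i] for i in range(len(pairs)) if (mask >> i) & 1}
        if not has_mono_triangle(n, black):
            return False
    return True

pentagon = {frozenset((i, (i + 1) % 5)) for i in range(5)}

print(has_mono_triangle(5, pentagon))         # pentagon coloring of K_5
print(every_coloring_has_mono_triangle(6))    # all 2^15 colorings of K_6
```

A search of this kind grows hopelessly fast with $n$, which is one way to appreciate why so few Ramsey numbers are known.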

It is not difficult to show that $N(1, t) = 1$ and $N(2, t) = t$ for all $t \ge 1$. Moreover,
the Ramsey numbers are symmetric, i.e., $N(s, t) = N(t,s)$ for all $s$ and $t$. The easy
proofs of these elementary observations obscure the difficulty of obtaining exact
values for Ramsey numbers in general.† In fact, every known Ramsey number
can be obtained by combining these elementary observations with the information
contained in Fig. 5.2.3.

Figure 5.2.3. Ramsey Numbers $N(s, t)$.

*After Frank Ramsey (1903-1930).

†See the lively and colorful article, "Ramsey Theory", by Ron Graham and Joel Spencer, in the July 1990
issue of Scientific American (pp. 112-117).

Because exact values of Ramsey numbers are so hard to determine, there is a
good deal of interest in bounding them.

5.2.3 Theorem. If $s, t \ge 2$, then $N(s,t)$ exists and $N(s, t) \le N(s, t - 1) +
N(s - 1, t)$.

The proof that every graph on $N(s,t-1) + N(s-1,t)$ vertices satisfies the
Ramsey property for $s$ and $t$ is left to the exercises.

5.2.4 Corollary. If $s$ and $t$ are positive integers, then $N(s, t) \le C(s + t - 2, s - 1)$.

Proof. The proof is by induction on $k = s + t$. It follows from the "elementary
observations" that $N(s, t) = C(s + t - 2, s - 1)$ if either $s$ or $t$ is at most 2. So,
we may proceed under the assumptions that $s, t \ge 3$, and that the result is true
for all values of $k < s + t$. Together with Theorem 5.2.3, these assumptions
yield

$$\begin{aligned}
N(s, t) &\le N(s, t-1) + N(s-1, t) \\
&\le C(s+t-3, s-1) + C(s+t-3, s-2) \\
&= C(s+t-2, s-1).
\end{aligned}$$ ■

What about lower bounds? 

5.2.5 Theorem. Ramsey number $N(s, t) \ge (s - 1)(t - 1) + 1$.

Proof. Let $n = (s - 1)(t - 1)$. It suffices to exhibit a black-white coloring of the
edges of $K_n$ in which there is no black $K_s$ and no white $K_t$. Imagine the vertices of
$K_n$ arranged in a rectangular array of $s - 1$ rows and $t - 1$ columns. If vertices $u$ and
$v$ lie in the same row of the array, color edge $\{u, v\}$ white. Otherwise, color it black.
By the pigeonhole principle, in any collection of $s$ vertices, some two of them must
come from the same row, and the edge joining them is white. Hence, this coloring of
$K_n$ can contain no black $K_s$. If all the black edges are deleted, then the connected
components of what's left correspond to the rows. Since each of these holds only
$t - 1$ vertices, $K_n$ can contain no white $K_t$. ■

It follows from Corollary 5.2.4 and Theorem 5.2.5, e.g., that $10 \ge N(3,4) \ge 7$.
In fact (see Fig. 5.2.3), $N(3,4) = 9$.
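The two bounds, and the row-and-column coloring used to prove the lower one, can be sketched as follows. The function names are hypothetical; `grid_coloring_is_good` simply rebuilds the coloring from the proof of Theorem 5.2.5 and checks by exhaustion that it avoids a black $K_s$ and a white $K_t$.

```python
from itertools import combinations
from math import comb

def upper_bound(s, t):
    """Corollary 5.2.4: N(s, t) <= C(s + t - 2, s - 1)."""
    return comb(s + t - 2, s - 1)

def lower_bound(s, t):
    """Theorem 5.2.5: N(s, t) >= (s - 1)(t - 1) + 1."""
    return (s - 1) * (t - 1) + 1

def grid_coloring_is_good(s, t):
    """Check the coloring from the proof on (s-1)(t-1) vertices."""
    vertices = [(r, c) for r in range(s - 1) for c in range(t - 1)]

    def white(u, v):
        return u[0] == v[0]      # same row of the array

    # No black K_s: every s-set contains at least one white edge.
    no_black = all(any(white(u, v) for u, v in combinations(S, 2))
                   for S in combinations(vertices, s))
    # No white K_t: every t-set contains at least one black edge.
    no_white = all(any(not white(u, v) for u, v in combinations(T, 2))
                   for T in combinations(vertices, t))
    return no_black and no_white

print(lower_bound(3, 4), upper_bound(3, 4))
print(grid_coloring_is_good(3, 4))
```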

Let's move on to another application of the correspondence between the different
graphs on $n$ vertices and the different black-white edge colorings of $K_n$. Denote
by $g(n,m)$ the number of nonisomorphic graphs having $n$ vertices and $m$ edges.
Then

$$f_n(x) = \sum_{m=0}^{C(n,2)} g(n,m)\,x^m \qquad (5.2)$$

is a generating function for the nonisomorphic graphs on $n$ vertices.

*The application discussed from this point to the end of the section involves counting nonisomorphic
graphs using the techniques developed in Sections 3.6 and 3.7. Readers who omitted those sections should
either skip the remainder of Section 5.2 or just skim it for the flavor and conclusion.

5.2.6 Example. The 11 nonisomorphic graphs on four vertices from Fig. 5.1.5
have been reproduced in Fig. 5.2.4. Using these pictures, it is easy to see that

$$f_4(x) = 1 + x + 2x^2 + 3x^3 + 2x^4 + x^5 + x^6. \qquad (5.3)$$

(Confirm that $f_4(1) = 11$.) □



Because $K_n$ is the unique graph having $n$ vertices and $m = C(n, 2)$ edges,
$g(n, C(n,2)) = 1$, i.e., $f_n(x)$ is a monic polynomial of degree $C(n,2)$. Since $G_1$
and $G_2$ are isomorphic if and only if $G_1^c$ and $G_2^c$ are isomorphic, $f_n(x)$ is symmetrical
in the sense that $g(n, m) = g(n, C(n, 2) - m)$, $0 \le m \le C(n, 2)$. It follows that $f_n(x)$
is a reciprocal polynomial, i.e.,

$$x^{C(n,2)} f_n(x^{-1}) = f_n(x).$$

(Confirm that $x^6 f_4(x^{-1}) = f_4(x)$.)
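For $n = 4$ the coefficients $g(4,m)$, and the symmetry just described, can be confirmed without any pictures. The sketch below is a brute-force tally (an illustrative approach with a hypothetical function name): it reduces each of the 64 graphs to a canonical relabeling and counts the isomorphism classes by number of edges.

```python
from itertools import combinations, permutations

def edge_count_distribution(n):
    """Return [g(n,0), ..., g(n,C(n,2))] by brute-force enumeration."""
    pairs = list(combinations(range(n), 2))
    perms = list(permutations(range(n)))
    classes = set()
    for mask in range(1 << len(pairs)):
        edges = [pairs[i] for i in range(len(pairs)) if (mask >> i) & 1]
        # Canonical form: smallest sorted edge list over all relabelings.
        canon = min(tuple(sorted(tuple(sorted((p[u], p[v]))) for u, v in edges))
                    for p in perms)
        classes.add(canon)
    g = [0] * (len(pairs) + 1)
    for canon in classes:
        g[len(canon)] += 1
    return g

g4 = edge_count_distribution(4)
print(g4)               # coefficients of f_4(x), constant term first
print(sum(g4))          # f_4(1)
print(g4 == g4[::-1])   # reciprocal polynomial: coefficients read the same backwards
```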

If we had a picture comparable to Fig. 5.2.4 for the 34 nonisomorphic graphs on
five vertices, it would be a simple matter to produce

$$f_5(x) = x^{10} + x^9 + 2x^8 + 4x^7 + 6x^6 + 6x^5 + 6x^4 + 4x^3 + 2x^2 + x + 1. \qquad (5.4)$$

On the other hand, if it were your assignment to produce such a picture, it would
surely be useful to know, for example, that the coefficient of $x^4$ in $f_5(x)$ is 6, i.e., that
there are exactly six nonisomorphic graphs having five vertices and four edges.
Okay, so how does one go about generating $f_5(x)$ without a picture?






Let's begin by taking $V = \{1, 2, \ldots, n\}$. Then $G_1 = (V,E_1)$ and $G_2 = (V,E_2)$
are isomorphic if and only if there is a permutation $p : V \to V$ such that

$$\{i,j\} \in E_1 \text{ if and only if } \{p(i), p(j)\} \in E_2. \qquad (5.5)$$

Recall (Definition 3.7.11) that the natural action of $p \in S_n$ on $V^{(2)}$ is denoted $\tilde{p}$,
where $\tilde{p} : V^{(2)} \to V^{(2)}$ is defined by

$$\tilde{p}(\{i,j\}) = \{p(i), p(j)\}. \qquad (5.6)$$

Expressed in terms of this induced action, Condition (5.5) becomes

$$e \in E_1 \text{ if and only if } \tilde{p}(e) \in E_2. \qquad (5.7)$$

In other words, $G_1$ is isomorphic to $G_2$ if and only if there is a permutation $\tilde{p}$ in the
pair group $S_n^{(2)}$ (see Definition 3.7.11) such that

$$\tilde{p}(E_1) = E_2. \qquad (5.8)$$

As a geometric object, the symmetry group of $K_n = (V, V^{(2)})$ is $S_n$, when it is
expressed as permutations of $V = \{1,2,\ldots,n\}$. As permutations of the edge set
$V^{(2)}$, it is $S_n^{(2)}$. Viewing $G = (V,E)$ as a 2-coloring of the edges of $K_n$, Condition
(5.8) implies that two graphs on $n$ vertices are isomorphic if and only if the
corresponding 2-colorings of $K_n$ are equivalent modulo $S_n^{(2)}$. This yields the
following.

5.2.7 Theorem. In terms of the cycle index polynomial for $S_n^{(2)}$, the generating
function for the nonisomorphic graphs on $n$ vertices is

$$f_n(x) = W_{S_n^{(2)}}(1, x) = Z_{S_n^{(2)}}\bigl(1 + x, 1 + x^2, 1 + x^3, \ldots, 1 + x^{C(n,2)}\bigr).$$

Proof. If $V = \{1,2,\ldots,n\}$ then, modulo $S_n^{(2)}$, the number of inequivalent
black-white colorings of $V^{(2)}$, in which exactly $m$ edges are colored black, is equal to
$g(n,m)$, the number of nonisomorphic graphs having $n$ vertices and $m$ edges.
Thus, it remains to substitute $w = 1$ and $b = x$ in the pattern inventory
$W_{S_n^{(2)}}(w, b)$ and use Pólya's theorem. ■

5.2.8 Example. If $n = 4$ then, from Equation (3.60),

$$Z_{S_4^{(2)}}(s_1, s_2, \ldots, s_6) = \tfrac{1}{24}\bigl(s_1^6 + 9s_1^2 s_2^2 + 8s_3^2 + 6s_2 s_4\bigr). \qquad (5.9)$$

The substitution $s_r = 1 + x^r$, $r \ge 1$, produces

$$\begin{aligned}
f_4(x) &= \tfrac{1}{24}\bigl[(1+x)^6 + 9(1+x)^2(1+x^2)^2 + 8(1+x^3)^2 + 6(1+x^2)(1+x^4)\bigr] \\
&= \tfrac{1}{24}\bigl[(1 + 6x + 15x^2 + 20x^3 + 15x^4 + 6x^5 + x^6) \\
&\qquad + 9(1 + 2x + 3x^2 + 4x^3 + 3x^4 + 2x^5 + x^6) \\
&\qquad + 8(1 + 2x^3 + x^6) + 6(1 + x^2 + x^4 + x^6)\bigr] \\
&= 1 + x + 2x^2 + 3x^3 + 2x^4 + x^5 + x^6,
\end{aligned}$$

which is exactly Equation (5.3). □
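The substitution $s_r = 1 + x^r$ in Equation (5.9) is routine polynomial arithmetic, sketched below with polynomials stored as coefficient lists (constant term first). The encoding of each monomial as a dictionary from cycle length to exponent is an illustrative choice, not notation from the text.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def poly_pow(a, k):
    """Raise a polynomial to the k-th power by repeated multiplication."""
    out = [1]
    for _ in range(k):
        out = poly_mul(out, a)
    return out

def s(r):
    """Coefficient list of 1 + x^r."""
    return [1] + [0] * (r - 1) + [1]

# Terms of Equation (5.9): (coefficient, {cycle length: exponent}).
terms = [(1, {1: 6}), (9, {1: 2, 2: 2}), (8, {3: 2}), (6, {2: 1, 4: 1})]

total = [0] * 7
for coeff, monomial in terms:
    product = [1]
    for r, e in monomial.items():
        product = poly_mul(product, poly_pow(s(r), e))
    for i, c in enumerate(product):
        total[i] += coeff * c

f4 = [c // 24 for c in total]
print(total)
print(f4)
```

Every entry of `total` is divisible by 24, as it must be if the quotient is to count graphs.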



5.2.9 Example. From Example 3.7.16, the cycle index polynomial for $S_5^{(2)}$ is

$$Z_{S_5^{(2)}}(s_1, s_2, \ldots, s_{10}) = \tfrac{1}{120}\bigl[s_1^{10} + 10s_1^4 s_2^3 + 20s_1 s_3^3 + 15s_1^2 s_2^4 + 30s_2 s_4^2 + 20s_1 s_3 s_6 + 24s_5^2\bigr].$$

Let's use this formula (and Theorem 5.2.7) to compute $g(5, 6)$, the coefficient of $x^6$
in $f_5(x)$. Because $C(10,6) = 210$,

$$(1+x)^{10} = 1 + \cdots + 210x^6 + \cdots + x^{10}.$$

Similarly,

$$\begin{aligned}
10(1+x)^4(1+x^2)^3 &= 10(1 + 4x + 6x^2 + 4x^3 + x^4)(1 + 3x^2 + 3x^4 + x^6) \\
&= 10\bigl(1 + \cdots + [(1)(x^6) + (6x^2)(3x^4) + (x^4)(3x^2)] + \cdots + x^{10}\bigr) \\
&= 10 + \cdots + 220x^6 + \cdots + 10x^{10}.
\end{aligned}$$

The coefficient of $x^6$ in $20(1 + x)(1 + x^3)^3$ is $20(1)(3) = 60$. In

$$\begin{aligned}
15(1+x)^2(1+x^2)^4 &= 15(1 + 2x + x^2)(1 + 4x^2 + 6x^4 + 4x^6 + x^8) \\
&= 15\bigl(1 + \cdots + [(1)(4x^6) + (x^2)(6x^4)] + \cdots + x^{10}\bigr) \\
&= 15 + \cdots + 150x^6 + \cdots + 15x^{10},
\end{aligned}$$

it is 150. It is $30(1)(2) = 60$ in $30(1 + x^2)(1 + x^4)^2$, 20 in $20(1 + x)(1 + x^3)(1 + x^6)$,
and 0 in $24(1 + x^5)^2$. Summing up, the coefficient of $x^6$ in $f_5(x)$ is

$$\tfrac{1}{120}(210 + 220 + 60 + 150 + 60 + 20 + 0) = \tfrac{720}{120} = 6.$$
The g(5, 6) = 6 nonisomorphic graphs having five vertices and six edges are illus- 
trated in Fig. 5.2.5. The first few values of g(n, m) are tabulated in Fig. 5.2.6. □ 
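
The arithmetic in Examples 5.2.8 and 5.2.9 can be checked mechanically. The following Python sketch is not part of the text, and its function names are illustrative. It averages, over all permutations p in S_n, the product of the factors (1 + x^r) taken over the cycles of the permutation that p induces on 2-element subsets. Looping over all n! permutations is a convenience assumption that is practical only for small n.

```python
from itertools import combinations, permutations
from math import factorial


def pair_cycle_lengths(perm):
    """Cycle lengths of the permutation induced by perm on 2-element subsets."""
    seen = set()
    lengths = []
    for pair in combinations(range(len(perm)), 2):
        if pair in seen:
            continue
        length, cur = 0, pair
        while cur not in seen:
            seen.add(cur)
            a, b = perm[cur[0]], perm[cur[1]]
            cur = (a, b) if a < b else (b, a)
            length += 1
        lengths.append(length)
    return lengths


def graph_counting_poly(n):
    """Return [g(n, 0), g(n, 1), ..., g(n, C(n, 2))]."""
    edges = n * (n - 1) // 2
    total = [0] * (edges + 1)
    for perm in permutations(range(n)):
        poly = [1] + [0] * edges
        for r in pair_cycle_lengths(perm):
            # Multiply poly by (1 + x^r).
            new = poly[:]
            for i in range(edges + 1 - r):
                new[i + r] += poly[i]
            poly = new
        for i, c in enumerate(poly):
            total[i] += c
    # Burnside/Polya averaging over the group S_n.
    return [c // factorial(n) for c in total]
```

Under these assumptions, `graph_counting_poly(4)` gives the coefficient list of the polynomial in Example 5.2.8, and `graph_counting_poly(5)[6]` gives g(5, 6). Summing a returned list evaluates the polynomial at x = 1, which is the count in Corollary 5.2.10.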
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5.2.10 Corollary. The number of nonisomorphic graphs on n vertices is given 
by the formula 

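\[ \sum_{m=0}^{C(n,2)} g(n, m) = \frac{1}{n!} \sum_{p \in S_n} 2^{c(\tilde{p})}, \]

where $c(\tilde{p})$ is the number of cycles in the permutation of $V^{(2)}$ induced by $p$.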



Proof. The result follows from setting x = 1 in Theorem 5.2.7. 



5.2. EXERCISES 

1 Let H be an induced subgraph of G. If K is an induced subgraph of H, prove 
that K is an induced subgraph of G. 

2 Prove the "elementary observations" about Ramsey numbers, i.e., that 
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Figure 5.2.6 The number g(n,m) of graphs with n vertices and m edges. 
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(a) N(s, t) = N(t, s) for all s, t > 1. 

(b) N(1,t) = 1 for all t > 1.

(c) N(2,t) = t for all t > 1. 

3 Prove from scratch (i.e., without using Theorem 5.2.3) that 

(a) $N(3,4) \le 10$.

(b) $N(4,4) \le 20$.

4 Prove Theorem 5.2.3. (Hint: Exercise 3.) 

5 How many nonisomorphic graphs are there 
(a) on six vertices? (b) on seven vertices? 

6 Explain how the graph in Fig. 5.2.7 proves that Ramsey number N(3,4) > 8.




7 Of the six graphs in Fig. 5.2.5, only $G_1$ and $G_6$ share the same degree
sequence. Prove that $G_1$ and $G_6$ are not isomorphic

(a) by counting components in their complements.

(b) by an argument based on the fact that the two vertices of degree 3 are
adjacent in $G_6$ but not adjacent in $G_1$.

8 Illustrate the nonisomorphic graphs having five vertices and four edges. 

9 Compute $f_3(x)$

(a) from an illustration of the nonisomorphic graphs on three vertices. 

(b) using Theorem 5.2.7. 

10 Illustrate the nonisomorphic graphs having five vertices and 
(a) seven edges. (b) three edges. 

11 Suppose $n \ge 4$. Independently of Theorem 5.2.7, give an intuitive explanation
why there should be exactly two nonisomorphic graphs having n vertices and 
two edges. 

12 Independently of Theorem 5.2.7, give an intuitive explanation why there 
should be exactly two nonisomorphic graphs having 
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(a) five vertices and eight edges. 

(b) six vertices and 13 edges. 

13 Compute g(6, 3) 

(a) without appealing to Theorem 5.2.7. 

(b) using Theorem 5.2.7 in the manner of Example 5.2.9. 

14 Prove that g(n, 3) = g(6, 3) for all n > 6. 

15 Verify the value for g(6, m) tabulated in Fig. 5.2.6 when 
(a) m = 4. (b) m = 5. (c) m = 6. (d) m = 1. 

16 Prove that g(n, m) = g(2m, m) for all n > 2m. 

17 Illustrate the nine nonisomorphic graphs having six vertices and 
(a) 4 edges. (b) 11 edges.

18 How many of the nonisomorphic graphs on five vertices are connected? 

19 Prove that the graphs illustrated in Fig. 5.2.8 are not isomorphic. 




Figure 5.2.8 



20 Let $\Gamma_n$ be the set of all $2^{C(n,2)}$ graphs on n vertices, and let $\Gamma_{n,k}$ be the subset of
$\Gamma_n$ consisting of those graphs that contain a k-vertex clique.

(a) Prove that $o(\Gamma_{n,k}) \le C(n,k)\, 2^{C(n,2) - C(k,2)}$.

(b) Prove that $o(\Gamma_{n,k})/o(\Gamma_n) < n^k/[k!\, 2^{C(k,2)}]$.

(c) Prove that $o(\Gamma_{n,k})/o(\Gamma_n) < \tfrac{1}{2}$ when $n < 2^{k/2}$.

(d) Prove Erdős's theorem: $N(k,k) \ge 2^{k/2}$.

21 A proper coloring of the edges of (an arbitrary graph) G is one in which 
adjacent edges are colored differently. The edge chromatic number k(G) is the 
smallest number of colors that suffice to properly color the edges of G. 
Evidently, $k(G) \ge d_1$, the largest vertex degree in G. In 1964, Russian
mathematician V. G. Vizing proved that $k(G) \le d_1 + 1$.

(a) Prove that $k(G) = d_1$ for every connected graph G on four vertices.

(b) If $G = K_3$, prove that $k(G) = d_1 + 1$.

(c) Exhibit a connected graph $G \ne K_3$ for which $k(G) = d_1 + 1$.
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5.3. CHROMATIC POLYNOMIALS 

The intellect of man is forced to choose. 

— William Butler Yeats 

In Section 5.2, we discussed edge colorings of the complete graph K n . In this 
section, we are interested in coloring vertices, not only of K n , but of any graph. 
An r-coloring of G is a function from V(G) into some set of r colors. 

5.3.1 Definition. A proper coloring of G is one in which adjacent vertices are
colored differently. The number of proper r-colorings of G is denoted p(G, r). 

5.3.2 Example. If $G = K_n^c$, then the criterion that adjacent vertices be colored
differently is no restriction at all:

\[ p(K_n^c, r) = r^n. \] □

5.3.3 Example. Since every vertex of the complete graph is adjacent to every 
other vertex, the only proper colorings of K n are those for which all the vertices 
are colored differently. By the fundamental counting principle, 

\[ p(K_n, r) = r(r-1)(r-2)\cdots(r-n+1) = r^{(n)}, \]

the falling factorial function. In particular (Equations (2.33) and (2.34)),

\[ p(K_n, r) = r^n - s(n, n-1)\, r^{n-1} + s(n, n-2)\, r^{n-2} - \cdots + (-1)^{n-1} s(n, 1)\, r, \]

where $s(n, k)$ is a Stirling number of the first kind, $1 \le k \le n$. □
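
Examples 5.3.2 and 5.3.3 can be confirmed by direct enumeration. The sketch below is illustrative only (the helper names are not from the text): it counts proper r-colorings by testing every function from the vertex set into r colors, which is feasible only for very small graphs.

```python
from itertools import combinations, product


def count_proper_colorings(n, edges, r):
    """Count colorings of vertices 0..n-1 with r colors that are proper on edges."""
    return sum(
        all(coloring[u] != coloring[v] for u, v in edges)
        for coloring in product(range(r), repeat=n)
    )


def falling_factorial(r, n):
    """r(r-1)(r-2)...(r-n+1)."""
    result = 1
    for i in range(n):
        result *= r - i
    return result


n = 4
complete_edges = list(combinations(range(n), 2))  # edge set of K_4
for r in range(7):
    # No edges: every coloring is proper.
    assert count_proper_colorings(n, [], r) == r ** n
    # Complete graph: all vertices must receive different colors.
    assert count_proper_colorings(n, complete_edges, r) == falling_factorial(r, n)
```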

These examples turn out to be typical in the sense that, for any graph G on n 
vertices, p(G, r) is a monic polynomial of degree n in r. One way to establish 
this fact makes use of a recursive algorithm for computing "chromatic polyno- 
mials". 

5.3.4 Definition. Suppose e = {u, v} is an edge of G = (V,E). The edge sub- 
graph G — e = (V, E \ {e}) is the graph obtained from G by deleting edge e. 

Let G be the graph illustrated in Fig. 5.3.1a, with e = {u, v}. Then G — e is
pictured in Fig. 5.3.1b.

Note that every proper coloring of G is a proper coloring of G — e. The differ- 
ence p(G — e, r) — p(G, r) is the number of proper colorings of G — e in which u 
and v are colored the same. To evaluate this difference, consider the multigraph 
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obtained from G — e by identifying vertices u and v, i.e., by coalescing u and v into 
a single vertex. This multigraph is illustrated in Fig. 5.3.2a. Observe that there is a 
one-to-one correspondence between proper colorings of the multigraph and those 
colorings of G — e in which u and v are colored the same. 

From the perspective of proper (vertex) colorings, extra edges are immaterial. 
There is a one-to-one correspondence between proper colorings of the multigraph 
in Fig. 5.3.2a and proper colorings of its underlying graph G/e, pictured in 
Fig. 5.3.2b. In particular, the difference p(G — e, r) — p(G, r) = p(G/e, r).
Rearranging terms in this equation proves the following fundamental result.

5.3.5 Theorem (Chromatic Reduction). Let G be a graph. If e = {u, v} is an 

edge of G, then 

p(G,r)=p(G-e,r)-p(G/e,r), (5.10) 

where G — e is the graph obtained from G by deleting edge e, and G/e is the graph 
obtained from G — e by identifying vertices u and v, and deleting any multiple 
edges that may arise in the process. 
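
Chromatic reduction translates directly into a recursive procedure. The following is a minimal Python sketch, not the author's algorithm as stated; the representation (a set of vertices plus a set of 2-element edges) and the name `chromatic_poly` are assumptions made for illustration. It returns the coefficients of the polynomial in increasing powers and takes exponential time, so it suits only small graphs.

```python
def chromatic_poly(vertices, edges):
    """Coefficients c[k] of x^k in p(G, x), computed via p(G) = p(G - e) - p(G/e)."""
    vertices = frozenset(vertices)
    edges = frozenset(frozenset(e) for e in edges)
    if not edges:
        # No edges: p(G, x) = x^n.
        return [0] * len(vertices) + [1]
    e = next(iter(edges))
    u, v = tuple(e)
    deleted = edges - {e}  # edge set of G - e
    # Identify v with u; using a set discards any multiple edges that arise.
    contracted = frozenset(
        frozenset(u if w == v else w for w in f) for f in deleted
    )
    p_del = chromatic_poly(vertices, deleted)
    p_con = chromatic_poly(vertices - {v}, contracted)
    # p_con has one fewer coefficient; pad it before subtracting.
    return [a - b for a, b in zip(p_del, p_con + [0])]
```

For instance, for a triangle with one pendant edge, `chromatic_poly(range(4), [(0, 1), (1, 2), (0, 2), (2, 3)])` yields the coefficient list of x^4 - 4x^3 + 5x^2 - 2x.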

5.3.6 Example. Let's use chromatic reduction to work out p(G, r) for the graph 
shown in Fig. 5.3.1a. With respect to the edge e = {u, v}, Equation (5.10) may be 
written in the picturesque form 




(5.11) 



In Equation (5.11), a picture of H has been used to represent p(H,r). Another 
picturesque application of Theorem 5.3.5 yields 
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After consolidating isomorphic graphs, this last equation becomes 




[pictorial equation; figure not recoverable]

Another step (consolidation included) produces 




\[ p(G, r) = p(K_4^c, r) - 4p(K_3^c, r) + 5p(K_2^c, r) - 2p(K_1, r). \]

Because $p(K_n^c, r) = r^n$, this last equation is equivalent to

\[ p(G, r) = r^4 - 4r^3 + 5r^2 - 2r. \tag{5.12} \]
□



If G is any graph with n vertices and m edges then, after m steps, chromatic
reduction results in an expression of the form

\[
\begin{aligned}
p(G, r) &= p(K_n^c, r) - b_1\, p(K_{n-1}^c, r) + b_2\, p(K_{n-2}^c, r) - \cdots \\
&= r^n - b_1 r^{n-1} + b_2 r^{n-2} - \cdots,
\end{aligned}
\]

where $b_1, b_2, \ldots$ are integers. This proves the following:

5.3.7 Corollary. Let G be a graph on n vertices. Then p(G, r) is a monic poly- 
nomial of degree n in the variable r. 
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Now that we know p(G, r) is a polynomial, we may as well replace r with a more 
customary variable. 

5.3.8 Definition. The chromatic polynomial* of G is 

\[ p(G, x) = x^n - b_1 x^{n-1} + b_2 x^{n-2} - \cdots + (-1)^{n-1} b_{n-1} x. \]

From Equation (5.12), $f(x) = p(G, x) = x^4 - 4x^3 + 5x^2 - 2x$ is the chromatic
polynomial of

[graph G; figure not reproduced] (5.13)

meaning that $f(r)$ is the number of proper colorings of G using (at most) r colors.
Because it contains a three-vertex clique, G cannot be properly colored with fewer
than three colors. Therefore, $f(0) = f(1) = f(2) = 0$, which implies that $x(x-1)(x-2)$
is a factor of $f(x)$. Indeed,

\[ f(x) = p(G, x) = x(x-1)^2(x-2). \tag{5.14} \]

An important open problem in graph theory is to determine when a given polynomial
is the chromatic polynomial of some graph. Consider, for example,
$p(x) = x(x-1)(x-3)^2$. If $p(x) = p(G, x)$ for some graph G then, because $p(3) = 0$, G
could not be properly colored with three (or fewer!) colors. But, $p(2) = 2 > 0$
implies that G is properly 2-colorable! This contradiction proves that $p(x)$ is not
the chromatic polynomial of any graph. It also suggests something more. For any
graph G, there is some minimum positive integer k (depending on G) such that
$p(G, r) = 0$ whenever $r < k$, but $p(G, r) > 0$ for every integer $r \ge k$.

5.3.9 Definition. The chromatic number $\chi(G)$ is the minimum number of colors
that suffice to color G properly.†

The chromatic number of the graph in Equation (5.13) is 3, the first positive 
integer that is not a root of its chromatic polynomial (Equation (5.14)). 
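
The observation that the chromatic number is the first positive integer that is not a root can be phrased as a short search. This sketch assumes its input really is the coefficient list (in increasing powers) of the chromatic polynomial of some graph with at least one vertex; for any other input the loop need not terminate. The function name is hypothetical.

```python
def chromatic_number_from_poly(coeffs):
    """Smallest positive integer r with p(r) != 0, where p has coefficients coeffs[k] of x^k."""
    r = 1
    while sum(c * r ** k for k, c in enumerate(coeffs)) == 0:
        r += 1
    return r


# p(G, x) = x^4 - 4x^3 + 5x^2 - 2x from Equation (5.12), lowest power first.
print(chromatic_number_from_poly([0, -2, 5, -4, 1]))
```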

5.3.10 Definition. If $G_1 = (V_1, E_1)$ and $G_2 = (V_2, E_2)$ are graphs on disjoint
sets of vertices, their union is the graph $G_1 + G_2 = (V_1 \cup V_2, E_1 \cup E_2)$.




*The chromatic polynomial of a planar graph was introduced in 1912 by G. Birkhoff as part of his effort to
prove the four-color theorem.

†Computing $\chi(G)$ is an NP-complete problem.

If $G_1$ and $G_2$ are connected, then $G_1 + G_2$ is a graph with two components, one
isomorphic to $G_1$ and the other isomorphic to $G_2$.

5.3.11 Theorem. If $G_1 = (V_1, E_1)$ and $G_2 = (V_2, E_2)$ are graphs on disjoint sets
of vertices, then

\[ p(G_1 + G_2, x) = p(G_1, x)\, p(G_2, x). \]

Proof. The result is an immediate consequence of the definition of p(G, r) and the 
fundamental counting principle. ■ 

If G is not connected then, from Theorem 5.3.11, 

\[ \chi(G) = \max \chi(C), \]

where the maximum is over the components C of G. 

Since every graph has at least one vertex, no graph can be properly colored with 
zero colors. The only graphs that can be properly colored with just one color are the 
graphs with no edges. Thus, $\chi(G) \ge 2$ for any graph with an edge.

5.3.12 Definition. If $\chi(G) \le 2$, then G is bipartite*.

Suppose G is a bipartite graph with at least one edge. Consider some proper 
blue-green coloring of G. Let V b and V g be the vertices of G that are colored 
blue and green, respectively. Then $V(G) = V_b \cup V_g$ is the disjoint union of two
parts such that every edge of G has one vertex in each part. Conversely, if V(G) 
is the disjoint union of two independent sets of vertices, then G can be properly 
2-colored. This explains the name "bipartite". (There may be more than one 
way to bipartition the vertex set of a bipartite graph.) 

5.3.13 Definition. Let s and t be positive integers. Suppose X and Y are disjoint
sets of s and t elements, respectively. Let $V = X \cup Y$. Then the complete bipartite
graph $K_{s,t} = (V, E)$, where $E = \{\{x, y\} : x \in X \text{ and } y \in Y\}$.

The complete bipartite graph K 2 ^ is illustrated in Fig. 5.3.3. Observe that K 2 ^ is 
"maximally bipartite" in the sense that x(G) = 3 for any graph G that can be 
obtained from K 2 ^ by adding an edge. 

5.3.14 Definition. If $G_1$ and $G_2$ are graphs on disjoint sets of vertices, their join
$G_1 \vee G_2 = (G_1^c + G_2^c)^c$ is the graph obtained from $G_1 + G_2$ by adding new edges
from each vertex of $G_1$ to every vertex of $G_2$.

*In chemical applications of graph theory, bipartite graphs correspond to so-called alternant 
hydrocarbons. 
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Figure 5.3.3 



Observe that the complete bipartite graph $K_{s,t} = K_s^c \vee K_t^c$.
More important than complete bipartite graphs are the "trees". 

5.3.15 Definition. Suppose $k \ge 3$. A cycle in G of length k is a sequence of
distinct vertices $(v_1, v_2, \ldots, v_k)$ such that $\{v_1, v_2\}, \{v_2, v_3\}, \ldots, \{v_{k-1}, v_k\}$, and
$\{v_k, v_1\}$ are all edges of G. A tree is a connected graph that does not have any
cycles.

The nonisomorphic trees on six vertices are illustrated in Fig. 5.3.4. 



5.3.16 Theorem. If T is a tree on n vertices, then $p(T, x) = x(x-1)^{n-1}$.
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Figure 5.3.4 
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The first step of Example 5.3.6 resulted in the picturesque equation 




Because the graphs on the right-hand side of this equation are both trees, it follows 
from Theorem 5.3.16 that 

\[
\begin{aligned}
p(G, x) &= x(x-1)^3 - x(x-1)^2 \\
&= x(x-1)^2[(x-1) - 1] \\
&= x(x-1)^2(x-2),
\end{aligned}
\tag{5.15}
\]

confirming Equation (5.14) for the graph G of Equation (5.13).
The following will be useful in the proof of Theorem 5.3.16. 

5.3.17 Lemma. Let T be a tree on n > 1 vertices. Then T has (at least) 
two vertices of degree 1. 

Proof. Among all the paths in T there is one of greatest length, say from vertex u 
to vertex v. If either u or v had degree greater than 1 then, because there are no 
cycles in T, the path from u to v could be extended. ■ 

Proof of Theorem 5.3.16. The proof is by induction on n. If n = 1, then
$p(T, x) = x$ and the proof is complete. So, suppose n > 1. Let u be a vertex of T
of degree 1 and let e be the unique edge incident with u. Then $T - e$ is a disconnected
graph having two components, one the isolated vertex u and the other isomorphic
to the tree $T/e$. By Theorem 5.3.11, $p(T - e, x) = x\, p(T/e, x)$. Hence, by
chromatic reduction (Theorem 5.3.5),

\[
\begin{aligned}
p(T, x) &= p(T - e, x) - p(T/e, x) \\
&= x\, p(T/e, x) - p(T/e, x) \\
&= (x - 1)\, p(T/e, x).
\end{aligned}
\]

Because $T/e$ is a tree on $n - 1$ vertices, the induction hypothesis gives
$p(T/e, x) = x(x-1)^{n-2}$, and the proof is complete. ■

5.3.18 Corollary. If T is a tree on n > 1 vertices, then $\chi(T) = 2$. So, every tree
is a bipartite graph.
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Proof. While the corollary is an immediate consequence of Theorem 5.3.16, a 
direct proof affords some additional insight. Let u be a fixed but arbitrary vertex 
of T. Let v be some other vertex. Since all trees are connected, there is a path in 
T from u to v. Indeed, this path must be unique. Otherwise, there would be a cycle 
in T. Define the distance (in T) from u to v to be the length of this unique path. 
Color vertex u blue. Color vertex v blue if the distance from u to v is even, and color 
it green if the distance is odd. 

If this scheme results in adjacent vertices $v_1$ and $v_2$ being colored the same, then
the path from u that determines the color of $v_2$ could not pass through $v_1$. But, that
means there are two paths from u to $v_2$, one that passes through $v_1$, and one that does
not. Hence, $v_1$ and $v_2$ lie on a cycle of T, contradicting the definition of a tree.

■ 

The notion of distance used in the proof of Corollary 5.3.18 can be extended. 

5.3.19 Definition. Let $G = (V, E)$ be a connected graph. Suppose $u, w \in V$. If
$u = w$, the distance $d(u, w) = 0$. If $u \ne w$, then $d(u, w)$ is the length of a shortest
path in G from u to w. The diameter of G is

\[ \max_{u, w \in V} d(u, w). \]



Using this notion of distance, the parity proof of Corollary 5.3.18 can be 
extended to obtain the following characterization of bipartite graphs. 

5.3.20 Theorem. Let G be a graph. Then G is bipartite if and only if it contains 
no cycles of odd length. 

Proof. It is easy to see that a cycle of odd length cannot be colored using two 
colors. Conversely, because of Theorem 5.3.11, it suffices to prove the theorem 
when G is connected. Let u be a fixed but arbitrary vertex of G. Color u blue. If 
$v \in V(G)$, color v blue if d(u, v) is even and color it green if d(u, v) is odd. Because
G has no cycles of odd length, the result is a proper 2-coloring. ■ 
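
The parity coloring used in this proof is what a breadth-first search computes. Here is a brief Python sketch under the assumption that the graph is given as a dictionary mapping each vertex to a list of its neighbors; the function name and representation are illustrative, not from the text. Handling each component separately mirrors the appeal to Theorem 5.3.11.

```python
from collections import deque


def two_color(adjacency):
    """Return a proper 2-coloring {vertex: 0 or 1}, or None if an odd cycle exists."""
    color = {}
    for start in adjacency:
        if start in color:
            continue  # this component has already been colored
        color[start] = 0  # "blue": distance 0 from the chosen root
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in color:
                    # v is one step farther from the root than u.
                    color[v] = 1 - color[u]
                    queue.append(v)
                elif color[v] == color[u]:
                    # Adjacent vertices at distances of equal parity: odd cycle.
                    return None
    return color


def cycle(n):
    """Adjacency dictionary of the cycle on vertices 0..n-1 (for n >= 3)."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
```

With these definitions, `two_color(cycle(4))` returns a coloring while `two_color(cycle(5))` returns `None`.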

We now return to the general study of chromatic polynomials. 

5.3.21 Definition. Let $G_1 = (V, E)$ and $G_2 = (W, F)$ be graphs on disjoint sets
of vertices. Suppose $\{u_1, u_2, \ldots, u_t\}$ and $\{w_1, w_2, \ldots, w_t\}$ induce (t-vertex) cliques
in $G_1$ and $G_2$, respectively. Let G be the graph obtained from $G_1 + G_2$ by identifying
$u_i$ with $w_i$, $1 \le i \le t$. Then G is an overlap of $G_1$ and $G_2$ in $K_t$.

5.3.22 Example. Graphs G and H in Figure 5.3.5 are two (nonisomorphic)
overlaps of $G_1$ and $G_2$ in $K_4$. They can also be viewed as overlaps of $G_1$ and $K_3$
in $K_2$. □

5.3.23 Theorem. If G is an overlap of $G_1$ and $G_2$ in $K_t$, then

\[ p(G, x) = \frac{p(G_1, x)\, p(G_2, x)}{x^{(t)}}. \]
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Figure 5.3.5 

Proof. If r colors are available, the vertices of the overlapping clique can be
colored properly in $r^{(t)}$ ways. Evidently, the remaining vertices of $G_1$ can then be
colored properly in $p(G_1, r)/r^{(t)}$ ways. Similarly (and independently), the remaining
vertices of $G_2$ can then be colored properly in $p(G_2, r)/r^{(t)}$ ways. So, by the
fundamental counting principle,

\[ p(G, r) = r^{(t)} \cdot \frac{p(G_1, r)}{r^{(t)}} \cdot \frac{p(G_2, r)}{r^{(t)}}. \]

The result now follows from the fact that the polynomial identity
$r^{(t)} p(G, r) = p(G_1, r)\, p(G_2, r)$ holds for infinitely many positive integers r. ■
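
As a small illustration that is not taken from the text, consider two triangles sharing an edge, that is, $K_4 - e$ viewed as an overlap of two copies of $K_3$ in $K_2$. Assuming the falling factorial notation $x^{(k)}$ used above, Theorem 5.3.23 gives:

```latex
\[
p(K_4 - e, x)
  = \frac{p(K_3, x)\, p(K_3, x)}{x^{(2)}}
  = \frac{x^{(3)}\, x^{(3)}}{x^{(2)}}
  = \frac{[x(x-1)(x-2)]^2}{x(x-1)}
  = x(x-1)(x-2)^2 .
\]
```

This agrees with a direct count: color the shared edge in $x(x-1)$ ways, then each of the two remaining vertices independently in $x - 2$ ways.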



5.3.24 Example. The graph G illustrated in Fig. 5.3.6 is an overlap of H and H'
[figures of H and H' not reproduced] in $K_4$. Because H and H' are isomorphic, they
have the same chromatic polynomial. Therefore, from Theorem 5.3.23,

\[ p(G, x) = \frac{p(H, x)^2}{x^{(4)}}. \tag{5.16} \]

Because H is the overlap of $K_3$ and $K_4$ in $K_2$, $p(H, x) = x^{(3)} x^{(4)} / x^{(2)} = (x - 2)\, x^{(4)}$.
Substituting this into Equation (5.16) yields

\[ p(G, x) = \frac{(x-2)^2\, x^{(4)} x^{(4)}}{x^{(4)}} = x(x-1)(x-2)^3(x-3). \]
□




Figure 5.3.6 
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EXERCISES 



1 Compute the chromatic polynomial of each of the following graphs.

(a) (b) (c) (d) (e) [graphs; figures not reproduced]

2 Compute the chromatic polynomials for the 11 nonisomorphic graphs on four
vertices. 

3 Let G be the wheel illustrated in Fig. 5.3.7. Compute the 

(a) chromatic number of G. 

(b) chromatic polynomial of G. 



4 The coefficients of p(G,x) are known to alternate in sign. (See Exercise 28, 
below.) Confirm this fact 

(a) when $G = K_n$. (b) when G is a tree.

5 Among the most famous open problems for chromatic polynomials is the
following conjecture of R. C. Read: If $p(G, x) = x^n - b_1 x^{n-1} + b_2 x^{n-2} - \cdots$,
then the sequence $b_1, b_2, \ldots$ is unimodal, i.e., there is an integer k, depending
on G, such that $b_1 \le b_2 \le \cdots \le b_k$ and $b_k \ge b_{k+1} \ge \cdots$. Confirm Read's
conjecture

(a) if G is a tree. (b) for $p(K_n, x)$, $3 \le n \le 8$.

6 In modern telecasts of National Football League games, one frequently has an 
opportunity to examine important plays from "the reverse angle". Let's look 
at chromatic reduction from the reverse angle, i.e., expressed in the 




Figure 5.3.7 
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form p(H,x) = p(H + e,x) + p(H/e,x), where H + e is obtained from H 
by adding in a new edge e = {u,v} that was not there before, and H/e is
obtained from H by identifying vertices u and v (and deleting superfluous 
edges). 

(a) Show that the following picturesque example of this reverse-angle
approach produces the same answer as Example 5.3.6: 




(b) If G is a graph on n vertices, prove that p(G,x) is a linear combination of
the falling factorial functions $x^{(k)}$, $k \le n$, with nonnegative integer
coefficients.

7 Use the reverse-angle technique of Exercise 6 to compute the chromatic 
polynomial of 




8 Prove that x 2 is a factor of p(G, x) whenever G is disconnected. (The converse 
turns out to be true as well.) 

9 Denote by $C_n$ the graph with n vertices, n edges, and a single cycle of length n.
Then $C_3 = K_3$, $C_4$ is the square, $C_5$ is the pentagon, etc.

(a) Draw suitable pictures, using dark and light vertices, to show that $C_4$ and
$C_6$ are bipartite.

(b) Use the chromatic polynomials of $C_4$ and $C_6$ to prove that they are
bipartite.

(c) Prove that $p(C_n, x) = (x-1)^n + (-1)^n (x-1)$.

(d) Use part (c) to prove that $C_n$ is bipartite if and only if n is even.

10 The path $P_n$ is the unique tree on n vertices with diameter n — 1. The clique
number $\omega(G)$ is the maximum value of t such that $K_t$ is an induced subgraph of
G. Evidently, $\chi(G) \ge \omega(G)$. Curiously enough, if G does not contain an
induced subgraph isomorphic to $P_4$, then $\chi(G) = \omega(G)$.

(a) Show that $\chi(C_5) > \omega(C_5)$. (Hint: Exercise 9.)

(b) Show that $\chi(P_4) = \omega(P_4)$.

11 Consider the graphs 




(a) Explain how G might be viewed as an "overlap of two copies of C 4 in P 3 ." 

(b) Without computing p(G,x), show that it could not possibly equal

$f(x) = p(C_4, x)^2 / p(P_3, x)$.

12 In 1941, R. L. Brooks proved that if G is neither an odd cycle nor a complete
graph, then $\chi(G) \le d_1$, the largest vertex degree of G. Confirm that the

(a) inequality fails for $C_5$. (See Exercise 9.)

(b) inequality fails for $K_4$.

(c) theorem is valid for $C_4$.

(d) theorem is valid for any tree on $n \ge 3$ vertices.

13 Let G be a graph with n vertices, m edges, and chromatic polynomial
$p(G, x) = x^n - b_1 x^{n-1} + \cdots$. Prove that $b_1 = m$.

14 Let $G_1$ and $G_2$ be graphs on disjoint sets of $n_1$ and $n_2$ vertices, respectively. A
coalescence of $G_1$ and $G_2$ is any graph on $n_1 + n_2 - 1$ vertices that can be
obtained from $G_1 + G_2$ by identifying (coalescing into a single vertex) some
vertex of $G_1$ with any vertex of $G_2$. Let $G_1 * G_2$ be one of the $n_1 n_2$ different
coalescences of $G_1$ and $G_2$.

(a) Prove that $p(G_1 * G_2, x) = p(G_1, x)\, p(G_2, x)/x$.

(b) Without actually computing them, prove that the chromatic polynomials 
of the three graphs in Fig. 5.3.8 are all the same. 




Figure 5.3.8 
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15 Prove that the chromatic polynomials of the four graphs in Fig. 5.3.9 are all the 
same. 




Figure 5.3.9 



16 Suppose $f(x)$ and $g(x)$ are defined in terms of falling factorial functions by

\[ f(x) = \sum_{i \ge 0} a_i x^{(i)} \quad \text{and} \quad g(x) = \sum_{j \ge 0} b_j x^{(j)}. \]

Define the join-product of $f(x)$ and $g(x)$ by

\[ f(x) \vee g(x) = \sum_{k \ge 0} \left( \sum_{t=0}^{k} a_t b_{k-t} \right) x^{(k)}. \]

Then, e.g., $(x^{(3)} + x^{(2)}) \vee (x^{(4)} + 2x^{(3)} + x^{(2)}) = x^{(7)} + 3x^{(6)} + 3x^{(5)} + x^{(4)}$. So,
the join-product of linear combinations of falling factorial functions $x^{(k)}$
behaves like an ordinary product of linear combinations of ordinary powers
of x. It turns out that the chromatic polynomial of a join of two graphs is just
the join-product of their chromatic polynomials, i.e., $p(G_1 \vee G_2, x) =
p(G_1, x) \vee p(G_2, x)$. This is, of course, a useful observation only if $p(G_1, x)$
and $p(G_2, x)$ are expressed in terms of falling factorial functions, as in
Exercise 6(b).

(a) Use the join-product approach to show that $p(K_{1,2} \vee C_4, x) = x(x-1)
(x-2)(x-3)(x^3 - 12x^2 + 50x - 71)$. (Hint: The complete bipartite
graph $K_{1,2}$ is a tree on three vertices, and $C_4$ is a square.)

(b) Prove the formula $p(G_1 \vee G_2, x) = p(G_1, x) \vee p(G_2, x)$. (Hint: Use the
reverse-angle approach of Exercise 6 on the part of $G_1 \vee G_2$ that used to
be $G_2$; note that $K_r \vee K_s = K_{r+s}$.)
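
The join-product is a convolution of coefficient lists taken in the falling factorial basis. The Python sketch below is illustrative only (the function names are not from the text): `join_product` convolves two such lists, and `to_ordinary` expands the result into ordinary powers of x using the recurrence x^(k+1) = x^(k) (x - k).

```python
def join_product(a, b):
    """a[i], b[j] are coefficients of x^(i), x^(j); return c with c[k] = sum of a[t] * b[k - t]."""
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c


def to_ordinary(c):
    """Expand sum of c[k] * x^(k) into coefficients of ordinary powers, lowest first."""
    result = [0]
    basis = [1]  # coefficients of x^(0) = 1
    for k, ck in enumerate(c):
        result = result + [0] * (len(basis) - len(result))
        for i, coef in enumerate(basis):
            result[i] += ck * coef
        # Next basis element: x^(k+1) = x^(k) * (x - k).
        nxt = [0] * (len(basis) + 1)
        for i, coef in enumerate(basis):
            nxt[i + 1] += coef
            nxt[i] -= k * coef
        basis = nxt
    return result


# The worked example from the exercise statement:
f = [0, 0, 1, 1]        # x^(3) + x^(2)
g = [0, 0, 1, 2, 1]     # x^(4) + 2x^(3) + x^(2)
joined = join_product(f, g)
assert joined == [0, 0, 0, 0, 1, 3, 3, 1]  # x^(7) + 3x^(6) + 3x^(5) + x^(4)
```

Evaluating `to_ordinary(joined)` at a point can then be compared with the factored form stated in part (a).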
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17 Use the join-product formula of Exercise 16 to express the chromatic poly- 
nomial of the following graph as a linear combination of falling factorial functions: 

(a) $K_{1,3}$. (b) $K_{2,3}$. (c) $K_{3,3}$. (d) $K_{4,3}$.

18 Compute the chromatic polynomial of 




19 Let G be a graph. Prove or disprove that 

(a) all roots of p(G,x) are real. 

(b) all positive roots of p(G,x) are integers. 

(c) all real roots of p(G,x) are positive. 

20 Suppose T = (V,E) is a tree on n vertices. Prove that T has n — 1 edges. 

21 Prove that $f(x) = x^6 - 12x^5 + 54x^4 - 112x^3 + 105x^2 - 36x$ is not the
chromatic polynomial of a graph.

22 Let $G = (V, E)$ be a graph with n vertices and m edges. Suppose
$e = \{u, v\} \in E$. To subdivide e means, informally, to put a new vertex in the
middle of e. Of course, adding a vertex changes the graph. Let $H = (W, F)$
be the new graph. Then $W = V \cup \{w\}$, where $w \notin V$; and $F = (E \setminus \{e\}) \cup
\{\{u, w\}, \{w, v\}\}$ is the set obtained from E by replacing {u, v} with new edges
{u, w} and {w, v}. (Note, e.g., that $d_H(w) = 2$.) If every edge of G is
subdivided, the resulting graph S(G) has n + m vertices and 2m edges. Prove,
for any graph G, that S(G) is bipartite.

23 Let $t_n$ be the number of nonisomorphic trees on n vertices.

(a) Prove that $t_4 = 2$.

(b) Illustrate three nonisomorphic trees on five vertices, explaining how you
can be sure that they are nonisomorphic.

(c) Illustrate the $t_7 = 11$ nonisomorphic trees on seven vertices.

24 A cycle of length n in a graph on n vertices is called a Hamiltonian cycle. A 
graph is Hamiltonian if it has a Hamiltonian cycle. 

(a) Illustrate the three nonisomorphic Hamiltonian graphs on four vertices. 

(b) Illustrate the two nonisomorphic Hamiltonian graphs having five vertices 
and no more than six edges. 

(c) Illustrate the two nonisomorphic Hamiltonian graphs having five vertices 
and seven edges. 
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(d) Prove that the existence of a Hamiltonian cycle is an invariant. 

(e) Find two Hamiltonian cycles in $K_5$ that, between them, contain all 10
edges of $K_5$.

(f) Find three Hamiltonian cycles in $K_7$ that, between them, contain all 21
edges of $K_7$.

25 In how many ways can the faces of a cube be colored, using r colors, so that 
any two faces that share an edge are colored differently? 

26 If $p(G, x) = x^n - b_1 x^{n-1} + \cdots + (-1)^{n-1} b_{n-1} x$, then G is both connected and
bipartite if and only if $b_{n-1}$ is odd. Use this criterion to prove that

(a) every tree is bipartite. 

(b) C n is bipartite if and only if n is even. (Hint: Exercise 9(c).) 

27 Consider the following recursive construction of a family of graphs called 
2-trees: (1) The smallest 2-tree is K2; (2) if e = {u, v} is an edge of a 2-tree G, 
on n vertices, then the graph obtained from G by adding a new vertex w and 
two new edges {u, w} and {v, w} is a 2-tree on n + 1 vertices. (Up to 
isomorphism, $K_3$ is the only 2-tree on three vertices, and $K_4 - e$ is the unique
2-tree on four vertices.) 

(a) Find the two nonisomorphic 2-trees on five vertices. 

(b) Find the five nonisomorphic 2-trees on six vertices. 

(c) If G is a 2-tree on n vertices, prove that its chromatic polynomial is
$p(G, x) = x(x-1)(x-2)^{n-2}$. (E. G. Whitehead has proved the converse,
i.e., if $p(G, x) = x(x-1)(x-2)^{n-2}$, then G is a 2-tree.)

28 Let G be a graph with n vertices and c connected components. Prove that 

(a) $p(G, x) = x^n - b_1 x^{n-1} + b_2 x^{n-2} - \cdots + (-1)^{n-c} b_{n-c} x^c$, i.e., prove that
$b_k = 0$ for all $k > n - c$.

(b) $b_1, b_2, \ldots, b_{n-c}$ are positive integers, i.e., the coefficients of p(G,x)
alternate in sign. (Hint: Induction on the number of edges.)

29 Prove that $p(G, t) \ne 0$ for all $t \in (0, 1)$.

30 Let G = (V, E) be a connected graph. If $u, v, w \in V$, show that the distance
from u to w satisfies

(a) $d(u, w)$ is a nonnegative integer.

(b) $d(u, w) \ge 0$, with equality if and only if u = w.

(c) $d(u, w) = d(w, u)$.

(d) $d(u, w) \le d(u, v) + d(v, w)$.

31 Let $s \ge 2$ be an integer. Suppose T is a fixed but arbitrary tree on $t \ge 2$
vertices. Let N be the smallest integer such that any graph G on N vertices
contains an s-vertex clique or a subgraph isomorphic to T.

(a) Prove that $N \ge (s-1)(t-1) + 1$.

(b) Prove that $N \le (s-1)(t-1) + 1$.

32 Let G = (V, E) be a graph with vertex set $V = \{v_1, v_2, \ldots, v_n\}$. Suppose the set
of colors is $C = \{x_1, x_2, \ldots, x_r\}$. The Stanley polynomial is

\[ \mathcal{S}(G, r) = \sum x_{f(v_1)} x_{f(v_2)} \cdots x_{f(v_n)}, \]

where the sum is over all proper colorings $f : V \to C$.

(a) Show that $\mathcal{S}(P_3, 3) = M_{[2,1]}(x_1, x_2, x_3) + 6 M_{[1^3]}(x_1, x_2, x_3)$, where $P_3$ is
the unique three-vertex tree.

(b) Show that substituting $x_1 = x_2 = \cdots = x_r = 1$ in $\mathcal{S}(G, r)$ produces
p(G,r).

*5.4. PLANAR GRAPHS 

What you call Solid things are really superficial; what you call Space is really nothing 
but a great Plane. 

— The Stranger (from E. A. Abbott's Flatland) 

As we have seen, illustrating graphs by points and lines can be misleading. An arc
representing an edge consists of infinitely many geometric points but only two
vertices. In depictions of graphs, it is not unusual for arcs representing nonadjacent
edges to cross. While the edges, themselves, do not intersect, their representing
arcs do. This raises the question of whether it is possible to draw pictures of graphs
with no edge crossings. Evidently (see Fig. 5.4.1), it is possible to draw $K_4$ with no
edge crossings, but what about $K_5$?

Provided there is enough space, it is always possible to draw a graph, any graph, 
without edge crossings. Represent the n vertices of G by the points 1,2, ... ,n along 
the x-axis in three-dimensional Euclidean space. Take m different planes that 
intersect in the x-axis, and draw one edge of G in each of them. 

What about two-dimensional space? Which graphs can be drawn in the plane 
with no edge crossings? This is a much more interesting question, not because 
the answer has any great significance, but because the search for answers has led 
to some good mathematics. 




K 4 

Figure 5.4.1 






5.4.1 Definition. A graph is planar if it can be illustrated in the plane in such a 
way that arcs representing edges do not meet except in points representing vertices. 

Less formally, G is planar if it can be drawn in the plane with no edge crossings. 
We will refer to such a drawing as a plane graph. So, the phrase "plane graph" 
means a specific plane illustration of some (necessarily planar) graph. 

Any discussion of plane graphs leads, sooner or later, to the notion of a 
"region". Imagine a plane graph as if it were a network of fences viewed from 
above. The vertices of the graph correspond to posts and its edges to fencing. 
From this perspective, a typical plane graph divides two-dimensional space into 
pastures, or regions, all but one of which is bounded. It is natural to wonder 
how the number r of regions might vary among different plane illustrations of 
the same planar graph G. Somewhat surprisingly, r = r(G) is the same for all plane 
representations of G. 

5.4.2 Theorem (Euler's Formula). If G is a plane graph with n vertices, m
edges, c components, and r regions, then $r = c + m - n + 1$.

Proof. The proof is by induction on m. If m = 0, then $G = K_n^c$ is a disconnected
graph consisting of c = n components each of which is an isolated vertex. In this
case, $c + m - n + 1 = n + 0 - n + 1 = 1$. Since there is just one (unbounded)
region, the m = 0 case is established.

Assume the theorem is true for every plane graph having $k \ge 0$ edges. Let G be a
plane graph with k + 1 edges, and suppose e is one of them. Now, it may happen
that e is part of the boundary separating two different regions. If so, then e lies on a
cycle of G, in which case $G - e$ is a plane graph having the same numbers of vertices
and components as G, but one fewer edge and one fewer region. Applying the
induction hypothesis to $G - e$, we obtain $r - 1 = c + (m - 1) - n + 1$, and the
proof is finished.

If the same region lies on both sides of e, then $G - e$ is a plane graph having the
same numbers of vertices and regions as G, but one fewer edge and one more component.
Applying the induction hypothesis to $G - e$ produces
$r = (c + 1) + (m - 1) - n + 1 = c + m - n + 1$. ■
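
Euler's formula makes the region count a function of n, m, and c alone, which is simple to compute. The sketch below is illustrative (the names are hypothetical) and takes planarity on trust: it counts components with a union-find structure and applies r = c + m - n + 1, without checking that the input graph can actually be drawn in the plane.

```python
def count_components(n, edges):
    """Number of connected components of a graph on vertices 0..n-1."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in range(n)})


def regions(n, edges):
    """Regions of any plane drawing, by Euler's formula; assumes the graph is planar."""
    return count_components(n, edges) + len(edges) - n + 1


# Graph of the cube: vertices are 3-bit labels, adjacent when they differ in one bit.
cube_edges = [(u, u ^ (1 << b)) for u in range(8) for b in range(3) if u < u ^ (1 << b)]
print(regions(8, cube_edges))
```

For the cube the result matches its number of faces, in line with the plane map of a cube discussed in this section.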

In the special case that G is a connected plane graph, Euler's formula is equiva- 
lent to 

r + n = m + 2. (5.17) 

The Flemish cartographer Gerhard Mercator (1512-1594) is generally credited
with inventing the technique of map making in which the meridians (lines of
longitude) are drawn parallel to each other; perpendicular to these, the parallels of
latitude are represented by straight lines whose distance from each other increases
with the distance from the equator. Regardless of the exact details, a Mercator
projection produces a plane map of the spherical Earth. The same sort of thing can be
done with any convex polyhedron. Figure 5.4.2 illustrates a plane map of a cube.
Note that, just as Greenland appears comparable in size to South America on a
typical plane map of the world, our plane map of the cube distorts the square faces.
Indeed, one of the six faces actually becomes unbounded.

Figure 5.4.2. Plane map of a cube.

The regions might also be described as the connected components of what is left of the plane after the
drawing of the graph has been etched away, i.e., the components of the complement of the plane graph.

In a similar way, any convex polyhedron can be represented by a plane graph in
which the vertices, edges, and faces of the polyhedron correspond, respectively, to
the vertices, edges, and regions of the graph. It follows from Equation (5.17) that
there is a relationship between the numbers F of faces, V of vertices, and E of edges
of any convex polyhedron, namely,

    $F + V = E + 2$.    (5.18)

5.4.3 Corollary. Let G be a planar graph with m edges and n vertices. Then

    $m \le 3n - 6$.    (5.19)

Proof. If G is a plane graph, it may happen that some nonadjacent pair of vertices
of G can be joined by a new edge e that does not cross any of the existing edges of
G, i.e., maybe G + e is still a plane graph. Assume that a maximum of k such edges
can be added to G. Call the resulting plane graph H. Then H has n vertices and
m + k edges. The proof will be completed by showing that $m + k = 3n - 6$.

Clearly, H is connected; otherwise more edges could be added without destroying
planarity. If the cycle bounding some region of H contained four or more edges,
then another edge could be added to H. Thus, the boundary cycles of the regions of
H all have length 3. Let r(H) be the number of regions of H. Then, counting the
edges that bound each region, we obtain the formula $2(m + k) = 3r(H)$. Substituting
in Euler's formula (applied to H) yields $\tfrac{2}{3}(m + k) = (m + k) - n + 2$,
i.e., $m + k = 3n - 6$. ■

The complete graph $K_5$ has n = 5 vertices and m = 10 edges; if $K_5$ were planar,
it would follow from Corollary 5.4.3 that $10 \le 15 - 6$, which is false.

Not surprisingly, strengthening the hypothesis of Corollary 5.4.3 also strengthens
its conclusion.

Figure 5.4.3 (panels a and b)

5.4.4 Corollary. Let G be a bipartite planar graph with m edges and n > 2
vertices. Then

    $m \le 2n - 4$.    (5.20)



The proof is similar. By Theorem 5.3.20, G has no odd cycles. So, this time, the
minimal cycle length is 4, and it follows that $2m \ge 4r$. Together with Euler's
formula, this implies that $\tfrac{1}{2}m \ge m - n + 2$. ■

Because the complete bipartite graph $K_{3,3}$ has n = 6 vertices and m = 9 edges, if
$K_{3,3}$ were planar, it would follow from Corollary 5.4.4 that $9 \le 12 - 4$, which is false.
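The two edge bounds are necessary conditions for planarity, not sufficient ones, and a short Python sketch makes the distinction concrete. The function name `passes_edge_bound` and its `bipartite` flag are hypothetical names chosen for this illustration; the function only compares m with the bound from Inequality (5.19) or (5.20), assuming n > 2.

```python
def passes_edge_bound(n, m, bipartite=False):
    """Necessary (not sufficient) planarity check, assuming n > 2.

    Uses m <= 2n - 4 for bipartite graphs (5.20) and m <= 3n - 6 otherwise (5.19).
    A result of False proves the graph is nonplanar; True proves nothing.
    """
    if bipartite:
        return m <= 2 * n - 4
    return m <= 3 * n - 6

print(passes_edge_bound(5, 10))                  # K_5: False, so K_5 is nonplanar
print(passes_edge_bound(6, 9, bipartite=True))   # K_{3,3}: False, so nonplanar
print(passes_edge_bound(6, 9))                   # K_{3,3} with (5.19) only: True
print(passes_edge_bound(10, 15))                 # Petersen graph: True, yet nonplanar
```

The last two lines show why the bipartite bound is needed for $K_{3,3}$ and why a graph that passes the test may still fail to be planar.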

If G contains a nonplanar subgraph then G, itself, cannot be planar. Thus, any
graph that contains a subgraph isomorphic to $K_5$ or to $K_{3,3}$ cannot be planar. In
1930, Kasimir Kuratowski proved a kind of converse, the statement of which
involves a new idea.

Let G = (V, E) be a graph with n vertices and m edges, of which e = {u, v} is
one. To subdivide e means, informally, to put a new vertex of degree 2 in the middle
of e. If H is the new graph, then $V(H) = V \cup \{w\}$, where $w \notin V$, and
$E(H) = (E \setminus \{e\}) \cup \{\{u, w\}, \{w, v\}\}$. A subdivision of G is any graph that can be
"constructed" from G by subdividing edges. The graph in Fig. 5.4.3a, for example,
is a subdivision of $K_4$; the graph in Fig. 5.4.3b is not.



5.4.5 Definition. Graphs $G_1$ and $G_2$ are homeomorphic if they have isomorphic
subdivisions.

Informally, "homeomorphic" might be thought of as "isomorphic to within
vertices of degree 2". In particular, any graph is homeomorphic to all of its
subdivisions. The graph in Fig. 5.4.4b is homeomorphic to the complete graph $K_4$
illustrated in Fig. 5.4.4a.

Figure 5.4.4 (panels a and b)

5.4.6 Kuratowski's Theorem. If G is not planar, then G has a subgraph
homeomorphic to $K_5$ or to $K_{3,3}$.

The proof of Kuratowski's theorem is beyond the scope of this text.
Almost from its inception, the study of planar graphs has been associated with
coloring problems. The following technical result is useful in this regard.

5.4.7 Lemma. Let G be a planar graph with m edges, n vertices, and minimum
vertex degree $d_n$. Then $d_n \le 5$.

Proof. If $d_n \ge 6$, then $2m = \sum_v d(v) \ge 6n$, contradicting Inequality (5.19). ■

5.4.8 Five-Color Theorem. If G is a planar graph, then $\chi(G) \le 5$.

Proof. The proof is by induction on the number of vertices of G. Since five colors
suffice to properly color any graph on $n \le 5$ vertices, planar or not, the induction is
off to a good start. Let us take as our induction hypothesis that $\chi(H) \le 5$ for every
plane graph H on k vertices. Let G be a plane graph on n = k + 1 vertices. By
Lemma 5.4.7, G has a vertex u of degree at most 5. Let H be the (plane) subgraph of G
obtained by deleting vertex u and all the edges incident with it. By the induction
hypothesis, $\chi(H) \le 5$. If $\chi(H) < 5$, then we can "lift" a four-coloring of H to G
and have a fifth color left over for u. So, we may assume $\chi(H) = 5$.

Suppose H to be properly 5-colored. If $d_G(u) < 5$ then lifting the 5-coloring of
H to G leaves a color available for u, i.e., the 5-coloring of H can be extended to a
5-coloring of G. Thus, we proceed under the assumption that $d_G(u) = 5$.

Figure 5.4.5 illustrates u and its five neighbors in the plane graph G. If it happens
that some two of $v_1, v_2, \ldots, v_5$ are colored the same in H, then the 5-coloring of H
can be extended to G. So, we come at last to the hard case in which vertex $v_i$ is
colored $c_i$, $1 \le i \le 5$, and these colors are all different.

Suppose there is a path in H from $v_1$ to $v_3$, the vertices of which are alternately
colored $c_1$ and $c_3$. Adjoining the path $[v_3, u, v_1]$ results in a cycle. Either $v_2$ is inside
this cycle (as shown in Fig. 5.4.6), or $v_2$ is outside and $v_4$ and $v_5$ are inside. Either
way, there could not exist a path in H from $v_2$ to $v_4$, the vertices of which are
alternately colored $c_2$ and $c_4$. (A path in H from $v_2$ to $v_4$ is a path in the plane graph G,
so it cannot cross any of the edges of our cycle. Because the colors are wrong,
neither can it pass through a vertex of the cycle.) We deduce that there does not
exist an alternating $c_1$-$c_3$ path in H from $v_1$ to $v_3$, or there does not exist an
alternating $c_2$-$c_4$ path in H from $v_2$ to $v_4$. As these two cases are equivalent, we may as
well assume there does not exist an alternating $c_1$-$c_3$ path in H from $v_1$ to $v_3$.

Figure 5.4.5

Figure 5.4.6

Perhaps no vertex of H is both adjacent to $v_1$ and colored $c_3$. If so, we can change
the color of $v_1$ from $c_1$ to $c_3$, freeing up color $c_1$ for u. The rest of the proof is an
extension of this idea.

Let W be the set of all those vertices $w \in V(H)$ such that there is an alternating
$c_1$-$c_3$ path in H from $v_1$ to w. (We are working under the assumption that $v_3 \notin W$.)
Observe that if $v \in V(H)$ is colored either $c_1$ or $c_3$, and if v is adjacent to a vertex of
W, then $v \in W$. Put another way, if $v \notin W$, but v is adjacent to some vertex in W,
then v is not colored $c_1$ or $c_3$. Consequently, if we interchange the colors of the
vertices in W, the result is a new proper 5-coloring of H, one in which both $v_1$
and $v_3$ are colored $c_3$. This frees up $c_1$ for u. ■
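The color interchange on W in the last step of the proof can be sketched in Python. This is a hedged illustration, not an implementation of the whole theorem: the function names, the dictionary encoding of H, and the small seven-vertex example (the five neighbors 1 to 5 of u plus two extra vertices "x" and "y") are all assumptions made here. In a properly colored graph, every path through vertices colored a or b alternates automatically, so W is just the connected piece of the a-b colored subgraph containing the start vertex.

```python
def kempe_swap(adj, color, start, a, b):
    """Interchange colors a and b on the set W of vertices reachable from
    `start` by alternating a-b paths; return the new coloring and W."""
    assert color[start] in (a, b)
    component, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for w in adj[v]:
            if w not in component and color[w] in (a, b):
                component.add(w)
                stack.append(w)
    new_color = dict(color)
    for v in component:
        new_color[v] = b if color[v] == a else a
    return new_color, component

def is_proper(adj, color):
    """True when no edge joins two vertices of the same color."""
    return all(color[u] != color[w] for u in adj for w in adj[u])

# Hypothetical H: neighbors 1..5 of u carry colors 1..5; x and y hang off vertex 1.
adj = {1: [2, 5, "x"], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 1],
       "x": [1, "y"], "y": ["x"]}
color = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5, "x": 3, "y": 1}

new_color, W = kempe_swap(adj, color, start=1, a=1, b=3)
used_near_u = {new_color[v] for v in (1, 2, 3, 4, 5)}
print(W, is_proper(adj, new_color), 1 in used_near_u)
```

In this example vertex 3 is not in W, so after the interchange vertices 1 and 3 share color 3 and color 1 is no longer used on the neighbors of u.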

Reviewing the proof of the five-color theorem, one cannot help but be struck by 
the uselessness of V5. It seems there ought to be a way to eliminate V5 and prove the 
following. 

5.4.9 Four-Color Theorem. If G is a planar graph, then $\chi(G) \le 4$.

The earliest surviving reference to the four-color problem dates to the 1850s
when Francis Guthrie mentioned it to his brother, Frederick, who happened to be
a student of Augustus de Morgan. In an 1852 letter, de Morgan shared the problem
with William Rowan Hamilton (who is known for many things, among them the
Cayley-Hamilton theorem of linear algebra). By 1879, the problem had been
widely circulated. In that year, the journal Nature announced that the four-color
theorem had been proved by Alfred Kempe. It wasn't until 1890 that Percy
Heawood discovered an error in Kempe's proof. While he could not fix the mistake,
Heawood was able to prove Theorem 5.4.8. Finally, in 1976, the four-color theorem
was established by Kenneth Appel and Wolfgang Haken. Appel and Haken used
more than 1000 hours of computer time to sort through a large number of cases.
Their work is frequently cited in philosophical discussions about the nature of
mathematical proof.

Figure 5.4.7

The original four-color problem was stated in terms of properly coloring the
regions of a plane graph. The connection between coloring regions and coloring
vertices is via the notion of a geometric dual. If G is a plane graph with vertex
set $V(G) = \{v_1, v_2, \ldots, v_n\}$, edge set $E(G) = \{e_1, e_2, \ldots, e_m\}$, and "region set"
$R(G) = \{f_1, f_2, \ldots, f_r\}$, then $R(G) = V(G^d)$ is the vertex set of its dual, $G^d$.
Vertices $f_i$ and $f_j$ are adjacent in $G^d$ if and only if regions $f_i$ and $f_j$ share an edge in
G. Thus, there is a natural one-to-one correspondence between the edges of $G^d$
and the edges of G. If $e \in E(G)$, then e bounds two (not necessarily different)
regions of G, say $f_i$ and $f_j$. The edge of $G^d$ corresponding to e is $\{f_i, f_j\}$.

5.4.10 Example. It is frequently convenient to draw $G^d$ right on top of G. In
such illustrations, a vertex of $G^d$ is placed in every region of G, and every edge
of G is crossed by exactly one edge of $G^d$. The situation for one plane graph G is illustrated
in Fig. 5.4.7. The bad news is that $G^d$ can be a multigraph. In fact, there is more bad
news. As illustrated in Fig. 5.4.8, the dual may even be a pseudograph, containing
loops as well as multiple edges. (A loop is an "edge" from a vertex to itself.)

□

Figure 5.4.8

5.4.11 Example. There is even more bad news. The plane graphs in Fig. 5.4.9
are isomorphic, but their dual multigraphs are not! □

Despite these complications, every dual pseudograph $G^d$ has a unique underlying
graph $G_d$, and $\chi(G^d) = \chi(G_d)$. Thus, coloring regions of G is the same as coloring
vertices of $G_d$. Because $G_d$ is also planar, Theorem 5.4.9 solves the original
four-color problem.



5.4. EXERCISES 

1 Prove that every tree is a planar graph. 

2 Use Equation (5.17) and Exercise 1 to prove that every tree on n vertices has 
m = n — 1 edges. (Compare with Exercise 20, Section 5.3.) 

3 In 1936, K. Wagner proved that every planar graph has a plane illustration
in which each edge is represented by a straight line segment. Draw such a
plane illustration of

(a) $K_5 - e$. (b) $K_{3,3} - e$. (c) G = [figure]




4 In 1990, chemists synthesized the first fullerene, a molecule $C_{60}$ consisting of
60 carbon atoms and nothing else. This third form of carbon (the first two
being graphite and diamond) had been predicted by R. Buckminster Fuller. Less
expected were $C_{70}$, $C_{76}$, $C_{84}$, $C_{90}$, and $C_{94}$, all of which had been produced by
1992. Every one of these higher fullerenes takes the shape of a convex
polyhedron each of whose faces is either a pentagon or a hexagon. Prove that,
for any such structure, the number of pentagonal faces is exactly 12. (Hint: Each
vertex has degree 3.)



Wagner's paper, "Bemerkungen zum Vierfarbenproblem," appeared in Jahresberichte D. M. V. 46
(1936), 26-32. The result was also discovered by I. Fáry, "On straight line representation of planar graphs,"
Acta Sci. Math. Szeged Univ. 11 (1948), 229-233.

5 Redraw each of the following as a plane graph. (Number the vertices of your
drawings to exhibit an isomorphism with the original graph.)




6 5 6 5 



6 Let G be the graph in Fig. 5.4.10. 

(a) Prove directly, without using the four-color theorem, that $\chi(G) = 4$.

(b) Prove that G is planar by redrawing it as a plane graph. (Number the
vertices of your drawing so as to exhibit an isomorphism with G.)

(c) Prove that Lemma 5.4.7 cannot be strengthened to the following: If G is a
planar graph on n vertices, then $d_n \le 4$.



1 




7 

Figure 5.4.10

7 Prove or disprove the converse of the four-color theorem.

8 What is the smallest number of edges among planar graphs of chromatic
number 4?

9 Let $G = C_4 \vee P_3$, the join of the square and the tree of Fig. 5.4.11. Is G planar?
Justify your answer.



Figure 5.4.11



10 The graph G in Fig. 5.4.12 is the Petersen graph from Example 5.1.7. Prove
that it is not planar by illustrating a subgraph of G that is homeomorphic to
$K_{3,3}$. (Hint: G will not be an induced subgraph.)




Figure 5.4.12 

11 Prove that any planar graph on n > 2 vertices has two vertices of degree at 
most 5. 

12 Let G be a graph on n > 10 vertices. Prove that G and G c cannot both be 
planar. 

13 Let G be a plane projection of a cube (illustrated in Fig. 5.4.2).

(a) Show that $G^d = G_d$. (In other words, show that the dual pseudograph of G
is, in fact, a graph.)

(b) It turns out that $G^d$ can be drawn so that it is a plane projection of another
regular polyhedron. Which one?

14 Let G be a plane graph and consider $G^{dd}$, the dual of $G^d$.

(a) If G is connected, prove that $G^{dd}$ is isomorphic to G.

(b) Illustrate $G^{dd}$ for the graph G having two components each of which is
isomorphic to $K_3$.

15 Let G and H be the plane graphs in Fig. 5.4.9.

(a) Prove that G and H are isomorphic.

(b) Show that $G^d$ and $H^d$ are not isomorphic.

(c) Prove that $G_d$ and $H_d$ are isomorphic.

16 Let G be the plane graph obtained by projecting a regular tetrahedron (pyramid
with a triangular base) onto the plane of its base.

(a) Prove that G is isomorphic to $K_4$.

(b) Prove that G is isomorphic to $G^d$.

17 Illustrate a graph G that contains a subgraph homeomorphic to ^3 but that 
satisfies %(G) < 3. 

18 Because $K_5$ is not planar, it cannot be drawn in the plane without any edge
crossings. However, if an over/underpass is erected on the plane, it is then
possible to draw $K_5$ with no edge crossings. (See Fig. 5.4.13.) The minimum
number of over/underpasses that are needed to draw a graph with no edge
crossings is its genus. Thus, planar graphs have genus 0 and $K_5$ has genus 1.

Figure 5.4.13. $K_5$ with an over/underpass.

(a) Prove that has genus 1 by drawing it (with no edge crossings) on a 
plane with one over/underpass. 

(b) Prove that $K_{3,3}$ has genus 1.

(c) Prove that has genus 1 . 

(d) In 1968, G. Ringel and J. W. Youngs proved that the genus of $K_n$ is
$\lceil (n - 3)(n - 4)/12 \rceil$, where $\lceil x \rceil$ is the smallest integer not less than x. Use
this formula to show that $K_7$ has genus 1.

19 Given a plane graph H, explain why there exists a plane graph G such that
$G^d = H$.

20 What does it mean to say that two plane graphs are isomorphic? Give a
mathematical definition of plane graph isomorphism.

I find that the harder I work, the more luck I seem to have.

— Thomas Jefferson 

Let's begin by giving formal definitions to two families of graphs that have been 
encountered several times already. 

5.5.1 Definition. Let $V = \{v_1, v_2, \ldots, v_n\}$. The path $P_n = (V, E)$, where
$E = \{\{v_i, v_{i+1}\} : 1 \le i < n\}$. The cycle $C_n = (V, F)$, where $F = E \cup \{\{v_n, v_1\}\}$.

So, $P_n$ is a path of length n - 1, and $C_n$ is a cycle of length n.

5.5.2 Example. $P_1 = K_1$, $P_2 = K_2$, $C_3 = K_3$, $P_3$ = [figure], $P_4$ = [figure],
$C_4$ = [figure], and so on. □



Recall that a subset of V(G) is independent if no two of its vertices are incident 
with the same edge of G. One might naturally suppose that a subset of E(G) is inde- 
pendent if no two of its edges are incident with the same vertex. For historical 
reasons, independent sets of edges are called matchings. 

5.5.3 Definition. Let G be a graph. A matching of G is a set of edges, no two of
which share a vertex. If $M \subseteq E(G)$ is a matching, and if $e = \{u, v\} \in M$, then u and
v are said to be matched vertices, covered by M. An r-matching is a matching
consisting of r edges. The matching number $\mu(G)$ is the largest number of edges in any
matching of G, i.e., the maximum value of r in any r-matching of G.

A 1-matching is a set consisting of a single edge covering two vertices. A
2-matching is a set of two (nonadjacent) edges covering four vertices. The edges
in a 3-matching cover six vertices, and so on. In particular, $\mu(G) \le \tfrac{1}{2}n$.

5.5.4 Definition. Let G be a graph on n vertices. A perfect matching is a
$\tfrac{1}{2}n$-matching, i.e., an r-matching where 2r = n.



Perfect matchings are sometimes called Kekulé structures, after August Kekulé, the chemist who showed
that the carbon atoms of a benzene molecule arrange themselves at the vertices of a hexagon.

Figure 5.5.1

5.5.5 Example. The three perfect matchings of $K_4$ are illustrated in Fig. 5.5.1.
With its edges numbered 1-6, as illustrated in Fig. 5.5.2, the 1-matchings of $C_6$
are $\{e_1\}, \{e_2\}, \ldots, \{e_6\}$. There are nine 2-matchings, namely, $\{e_1, e_3\}$, $\{e_1, e_4\}$,
$\{e_1, e_5\}$, $\{e_2, e_4\}$, $\{e_2, e_5\}$, $\{e_2, e_6\}$, $\{e_3, e_5\}$, $\{e_3, e_6\}$, and $\{e_4, e_6\}$. The two perfect
matchings of $C_6$ are $\{e_1, e_3, e_5\}$ and $\{e_2, e_4, e_6\}$. (In particular, $\mu(C_6) = 3$.) □

5.5.6 Definition. Suppose G is a graph on n vertices. Let q(G, r) be the number
of r-matchings of G, r > 0, and define q(G, 0) = 1. The matching polynomial of
G is

    $M(G, x) = \sum_{r \ge 0} (-1)^r q(G, r) x^{n - 2r}$.    (5.21)

Let G = (V, E) be a fixed but arbitrary graph with n vertices and m edges.
Because M is a 1-matching of G if and only if M = {e} for some $e \in E$,
q(G, 1) = m. Thus,

    $M(G, x) = x^n - m x^{n-2} + \cdots$    (5.22)

Equation (5.22) bears a striking resemblance to the chromatic polynomial
$p(G, x) = x^n - m x^{n-1} + \cdots$. One of the most striking differences is that q(G, r),
the number of r-matchings of G, is a coefficient of M(G, x), whereas p(G, r), the
number of proper r-colorings of G, is a value of p(G, x).

5.5.7 Example. From Example 5.5.5, the matching polynomial
$M(C_6, x) = x^6 - 6x^4 + 9x^2 - 2$. □
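For small graphs, Definition 5.5.6 can be applied directly by listing every set of r edges and keeping those that cover 2r distinct vertices. The Python sketch below does exactly that; the function names and the convention of storing a polynomial as a list whose k-th entry is the coefficient of $x^k$ are assumptions made for this illustration, and the enumeration is only practical for graphs with few edges.

```python
from itertools import combinations

def matching_counts(n, edges):
    """Return [q(G,0), q(G,1), ...] for a simple graph on n vertices."""
    counts = [1]                                   # q(G, 0) = 1 by definition
    for r in range(1, n // 2 + 1):
        q = sum(1 for subset in combinations(edges, r)
                if len({v for e in subset for v in e}) == 2 * r)
        counts.append(q)
    return counts

def matching_polynomial(n, edges):
    """Coefficient list p, with p[k] the coefficient of x**k in M(G, x)."""
    poly = [0] * (n + 1)
    for r, q in enumerate(matching_counts(n, edges)):
        poly[n - 2 * r] = (-1) ** r * q
    return poly

# The 6-cycle, with vertices 0..5 and edges {i, i+1 mod 6}.
c6 = [(i, (i + 1) % 6) for i in range(6)]
print(matching_counts(6, c6))        # [1, 6, 9, 2]
print(matching_polynomial(6, c6))    # [-2, 0, 9, 0, -6, 0, 1]
```

Read from the highest power down, the second list is $x^6 - 6x^4 + 9x^2 - 2$, in agreement with Example 5.5.7.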



Figure 5.5.2

M(G, x) was first introduced by H. Hosoya in a paper on chemical thermodynamics [Bull. Chem. Soc. Japan 44
(1971), 2332-2339]; chemists still refer to it as the acyclic polynomial. At roughly the same time,
O. J. Heilmann and E. H. Lieb used the same notion in a paper in statistical mechanics [Commun. Math.
Phys. 25 (1972), 190-232].

What's missing from the discussion so far is a convenient way to produce
M(G, x), one that does not involve having to count, much less list, all the matchings
of G. What's needed is an analogue of chromatic reduction.

5.5.8 Definition. Suppose $u \in V$, where G = (V, E) is a graph with at least two
vertices. Denote by G - u the subgraph of G induced on $W = V \setminus \{u\}$, i.e.,
G - u = (W, F) where $F = E \cap W^{(2)}$.

Informally, G — u is the graph obtained from G by deleting vertex u and all the 
edges incident with it. Note that extracting a vertex from G involves a more invasive 
kind of surgery than removing an edge. When edges are removed, the vertices are 
left undisturbed, V(G - e) = V(G). 

If H = G - u and $w \in W = V(H)$, then H - w = (G - u) - w is denoted
G - u - w, which brings us to the matching analogue of chromatic reduction.

5.5.9 Theorem. Let G = (V, E) be a graph with n > 2 vertices. Suppose
$e = \{u, w\} \in E$. Then

    $M(G, x) = M(G - e, x) - M(G - u - w, x)$.    (5.23)

Proof. The number of r-matchings of G that do not contain edge e is q(G - e, r).
The r-matchings that do contain e are in one-to-one correspondence with the
(r - 1)-matchings of G - u - w. Thus,

    $q(G, r) = q(G - e, r) + q(G - u - w, r - 1)$,  $r \ge 1$.    (5.24)

Now, q(G, r) is the coefficient of $(-1)^r x^{n-2r}$ in M(G, x) and q(G - e, r) is the
coefficient of $(-1)^r x^{n-2r}$ in M(G - e, x). But q(G - u - w, r - 1) is the coefficient of
$(-1)^{r-1} x^{(n-2)-2(r-1)} = -(-1)^r x^{n-2r}$ in M(G - u - w, x), i.e., it is the
coefficient of $(-1)^r x^{n-2r}$ in -M(G - u - w, x). In other words, Equation (5.23) is
the polynomial equivalent of Equation (5.24). ■
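Equation (5.23) translates directly into a recursive procedure: pick an edge e = {u, w}, compute the polynomial of the graph with e deleted, and subtract the polynomial of the graph with u and w deleted. The Python sketch below is one possible rendering under stated assumptions: polynomials are lists indexed by the power of x, the names `poly_sub` and `matching_poly` are hypothetical, an edgeless graph on n vertices is given the polynomial $x^n$, and the graph with no vertices is given the polynomial 1. The running time is exponential in the number of edges, so this is for small examples only.

```python
def poly_sub(p, q):
    """Subtract coefficient lists, padding the shorter one with zeros."""
    size = max(len(p), len(q))
    p = p + [0] * (size - len(p))
    q = q + [0] * (size - len(q))
    return [a - b for a, b in zip(p, q)]

def matching_poly(vertices, edges):
    """M(G, x) as a coefficient list (index = power of x), using (5.23)."""
    vertices = frozenset(vertices)
    edges = [tuple(e) for e in edges]
    if not edges:
        return [0] * len(vertices) + [1]          # edgeless graph: M = x**n
    (u, w), rest = edges[0], edges[1:]
    without_e = matching_poly(vertices, rest)                 # M(G - e, x)
    kept = [e for e in rest if u not in e and w not in e]
    without_uw = matching_poly(vertices - {u, w}, kept)       # M(G - u - w, x)
    return poly_sub(without_e, without_uw)

p4 = [(0, 1), (1, 2), (2, 3)]
c6 = [(i, (i + 1) % 6) for i in range(6)]
print(matching_poly(range(4), p4))   # [1, 0, -3, 0, 1]
print(matching_poly(range(6), c6))   # [-2, 0, 9, 0, -6, 0, 1]
```

The two printed lists encode $x^4 - 3x^2 + 1$ for the path on four vertices and $x^6 - 6x^4 + 9x^2 - 2$ for the 6-cycle.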

5.5.10 Corollary. Suppose G = (V, E) is a graph on n vertices. Let u be a vertex
of G of degree d(u) = k < n - 2. Suppose $w_i$, $1 \le i \le k$, are the vertices of G
adjacent to u. Then

    $M(G, x) = x M(G - u, x) - \sum_{i=1}^{k} M(G - u - w_i, x)$.    (5.25)

Proof. The proof is by induction on k. If k = 0, then u is an isolated vertex. In that
case, q(G, r) = q(G - u, r) for all r, and

    $M(G, x) = x^n - q(G, 1) x^{n-2} + q(G, 2) x^{n-4} - \cdots$
    $\phantom{M(G, x)} = x^n - q(G - u, 1) x^{n-2} + q(G - u, 2) x^{n-4} - \cdots$
    $\phantom{M(G, x)} = x[x^{n-1} - q(G - u, 1) x^{n-3} + q(G - u, 2) x^{n-5} - \cdots]$
    $\phantom{M(G, x)} = x M(G - u, x)$.    (5.26)

If k > 0, let $e = \{u, w_k\}$. If H = G - e then, from Equation (5.23),

    $M(G, x) = M(H, x) - M(G - u - w_k, x)$.

Because $d_H(u) = k - 1$ and H - u = G - u, it remains to apply the induction
hypothesis to M(H, x). ■

5.5.11 Example. Equation (5.22) suffices to determine that $M(P_1, x) = x$,
$M(P_2, x) = x^2 - 1$, and $M(P_3, x) = x^3 - 2x$. If n > 1, then $P_{n+1}$ has a vertex u of
degree 1 and, by Equation (5.25),

    $M(P_{n+1}, x) = x M(P_n, x) - M(P_{n-1}, x)$.    (5.27)

So, $M(P_4, x) = x(x^3 - 2x) - (x^2 - 1) = x^4 - 3x^2 + 1$. Similarly,
$M(P_5, x) = x^5 - 4x^3 + 3x$, $M(P_6, x) = x^6 - 5x^4 + 6x^2 - 1$, and so on. □

5.5.12 Example. Theorem 5.5.9 lends itself to the same kind of picturesque
usage as chromatic reduction. If $G = C_6$, for example, Equation (5.23) can be
expressed as









(In the matching analogue of chromatic reduction, vertices are not coalesced;
they are removed.) This picturesque equation is equivalent to
$M(C_6, x) = M(P_6, x) - M(P_4, x)$. From Example 5.5.11, $M(P_6, x) = x^6 - 5x^4 + 6x^2 - 1$ and
$M(P_4, x) = x^4 - 3x^2 + 1$. Hence, $M(C_6, x) = x^6 - 6x^4 + 9x^2 - 2$, confirming
Example 5.5.7. □

5.5.13 Example. Let's compute the matching polynomial of $K_n$. From Equation
(5.22), $M(K_1, x) = x$, $M(K_2, x) = x^2 - 1$, and $M(K_3, x) = x^3 - 3x$. From Fig. 5.5.1,
$M(K_4, x) = x^4 - 6x^2 + 3$. If n > 1, then $K_{n+1} - u = K_n$ and $K_{n+1} - u - w = K_{n-1}$.
So, from Equation (5.25),

    $M(K_{n+1}, x) = x M(K_n, x) - n M(K_{n-1}, x)$,  $n \ge 2$.    (5.28)

Thus, e.g.,

    $M(K_5, x) = x M(K_4, x) - 4 M(K_3, x)$
    $\phantom{M(K_5, x)} = x(x^4 - 6x^2 + 3) - 4(x^3 - 3x)$
    $\phantom{M(K_5, x)} = x^5 - 10x^3 + 15x$.    (5.29)

□

The so-called Hermite polynomials are recursively defined by $h_1(x) = x$,
$h_2(x) = x^2 - 1$, and $h_{n+1}(x) = x h_n(x) - n h_{n-1}(x)$. These polynomials first appeared
as solutions to the second-order, linear differential equation

    $y'' - xy' + ny = 0$.

It follows from Example 5.5.13 that $M(K_n, x) = h_n(x)$, $n \ge 1$. (It turns out that the
polynomials $M(P_n, 2x)$ are also well known. They are Chebyshev polynomials of
the second kind.)
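The recurrence (5.28), which is also the defining recurrence of the Hermite polynomials above, is simple to iterate by machine. In the Python sketch below, the function name and the coefficient-list representation (entry k is the coefficient of $x^k$) are assumptions made for illustration; the starting values are $M(K_1, x) = x$ and $M(K_2, x) = x^2 - 1$ from Example 5.5.13.

```python
def complete_graph_matching_polys(n_max):
    """Return {n: M(K_n, x)} for 1 <= n <= n_max, as coefficient lists."""
    polys = {1: [0, 1], 2: [-1, 0, 1]}
    for n in range(2, n_max):
        shifted = [0] + polys[n]                          # x * M(K_n, x)
        scaled = [n * a for a in polys[n - 1]] + [0, 0]   # n * M(K_{n-1}, x), padded
        polys[n + 1] = [s - t for s, t in zip(shifted, scaled)]
    return {k: polys[k] for k in range(1, n_max + 1)}

polys = complete_graph_matching_polys(5)
print(polys[4])   # [3, 0, -6, 0, 1]
print(polys[5])   # [0, 15, 0, -10, 0, 1]
```

The last list encodes $x^5 - 10x^3 + 15x$, matching Equation (5.29).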



What about doing some of these calculations by computer? One way to enter a 
graph into a computer is by means of a matrix. 



5.5.14 Definition. Let G = (V, E) be a graph with vertex set V = {1, 2, ..., n}.
The n x n adjacency matrix $A(G) = (a_{ij})$ is defined by

    $a_{ij} = \begin{cases} 1 & \text{if } \{i, j\} \in E, \\ 0 & \text{otherwise.} \end{cases}$    (5.30)

It is clear from the definition that A(G) is a symmetric (0,1)-matrix whose main
diagonal consists entirely of 0's, and that the number of 1's in row i of A(G) is
$d_G(i)$, the degree of vertex i. What about the other way around? Suppose you are
given an arbitrary n x n, symmetric (0,1)-matrix $A = (a_{ij})$ with zeros on the
diagonal. Must it be the adjacency matrix of some graph? Yes, and it is easy to
see how to illustrate the graph. Draw n vertices in the plane, number them from
1 to n, and draw an arc from vertex i to vertex j precisely when $a_{ij} = 1$.

After Charles Hermite (1822-1901). Among Hermite's students was the eminent mathematician Jules
Henri Poincaré (1854-1912).
After Pafnuti Chebyshev (1821-1894).

Obscured by the notation is the fact that A(G) depends, not only on G, but on the
numbering of its vertices. If $G_1$ and $G_2$ are the (isomorphic) graphs of Fig. 5.5.3,
then

    $A(G_1) = \begin{pmatrix} 0 & 1 & 1 & 1 \\ 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1 \\ 1 & 0 & 1 & 0 \end{pmatrix}$  and  $A(G_2) = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{pmatrix}$

are different matrices. How different? To answer this question, let f be the
permutation $(1432) \in S_4$. Then $f : V(G_1) \to V(G_2)$ is an isomorphism of $G_1$ onto $G_2$.
Corresponding to f is a permutation matrix $P(f) = (\delta_{i f(j)})$, i.e.,

    $P(f) = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{pmatrix}$    (5.31)

is the matrix obtained by permuting the columns of the identity matrix $I_4$ according
to the permutation f (an elementary column operation). The connection between
$A(G_1)$ and $A(G_2)$ is given by

    $A(G_2) = P(f) A(G_1) P(f)^{-1}$.    (5.32)

(Because P(f) is a permutation matrix, its inverse is equal to its transpose, i.e.,
$P(f)^{-1} = P(f)^t$.)

Conversely, if $A(G_2) = P A(G_1) P^{-1}$ for some permutation matrix P, then there is
a permutation $f \in S_n$ such that P = P(f). Moreover, $f : V(G_1) \to V(G_2)$ is an
isomorphism. Let's summarize these observations.

5.5.15 Theorem. Graphs $G_1$ and $G_2$ are isomorphic if and only if their
adjacency matrices are permutation similar, i.e., if and only if there is a permutation
matrix P such that $A(G_2) = P A(G_1) P^{-1}$.
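The relation between the two adjacency matrices can be confirmed numerically. The Python sketch below uses plain nested lists; the helper names `perm_matrix`, `matmul`, and `transpose` are assumptions made for this illustration, and the matrices and the permutation f = (1432) are the ones from the discussion of $G_1$ and $G_2$ above. The transpose of P stands in for its inverse, as is valid for any permutation matrix.

```python
def perm_matrix(f):
    """P(f) for a permutation f of {1..n} given as a dict:
    entry (i, j) is 1 exactly when i == f(j)."""
    n = len(f)
    return [[1 if i == f[j] else 0 for j in range(1, n + 1)]
            for i in range(1, n + 1)]

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

A1 = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 1], [1, 0, 1, 0]]
A2 = [[0, 0, 0, 1], [0, 0, 1, 1], [0, 1, 0, 1], [1, 1, 1, 0]]
f = {1: 4, 4: 3, 3: 2, 2: 1}          # the 4-cycle (1432)

P = perm_matrix(f)
print(P)                                          # rows of P(f)
print(matmul(matmul(P, A1), transpose(P)) == A2)  # True
```

Deciding whether two arbitrary graphs are isomorphic this way would require trying up to n! permutation matrices, which is why easily computed invariants are of interest.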

Theorem 5.5.15 opens a window on a new class of invariants. 

5.5.16 Corollary. Graphs $G_1$ and $G_2$ are isomorphic only if $A(G_1)$ and $A(G_2)$
have the same characteristic polynomial, i.e., only if
$\det(xI_n - A(G_1)) = \det(xI_n - A(G_2))$.

Proof. From linear algebra, two real symmetric matrices are similar if and only if
they have the same characteristic polynomial. ■

Another perspective from which to view Corollary 5.5.16 is this: If nineteenth-century
linear algebraists had a quest, it was to solve the similarity problem by finding
a short list of easily computed (similarity) invariants sufficient to determine
when two matrices are similar. That quest was successfully completed long ago,
at least for matrices over the real numbers. For real symmetric matrices, one
such list has a single entry, the characteristic polynomial. This raises some interesting
questions. For starters, might the adjacency characteristic polynomial solve the
graph isomorphism problem? An equivalent formulation of the question is this: Can
two symmetric (0,1)-matrices be similar without being permutation similar? That
the answer to the reformulated question is yes will be confirmed momentarily.

While $\det(xI_n - A(G))$ does not solve the graph isomorphism problem all by
itself, neither do any of the other invariants we have studied. Our situation is not
unlike that of a physician trying to treat a patient suffering from some particularly
stubborn disease. If no single drug cures the patient, why not try a combination of
drugs? In medicine, mixing drugs can have fatal consequences. While the
graph-theoretic analogue may be less vital, it is no less interesting: To what extent
is the new invariant, in this case $\det(xI_n - A(G))$, independent of other invariants?
To address this question, define a forest to be an acyclic graph, i.e., a graph without
any cycles. Then G is a forest if and only if each of its connected components is
a tree.

Figure 5.5.4

5.5.17 Theorem. Let G be a graph on n vertices. Then G is a forest if and only if
$\det(xI_n - A(G)) = M(G, x)$.

Proof Sketch. If V(G) = {1, 2, ..., n}, then $\det(xI_n - A(G))$ is an alternating sum
of n! products, one for each permutation of {1, 2, ..., n}. The product corresponding
to $p \in S_n$ is nonzero if and only if $\{i, p(i)\} \in E(G)$ for all $i \ne p(i)$. In particular,
there is a one-to-one correspondence between the r-matchings of G and the nonzero
products arising from permutations of cycle type $[2^r, 1^{n-2r}]$. This correspondence
yields $\det(xI_n - A(G)) = M(G, x)$ + terms involving cycles of G. If G is acyclic,
the proof is complete. Otherwise, one must show that the added terms make a
nonzero contribution. ■

5.5.18 Example. Let $T_1$ and $T_2$ be the trees illustrated in Fig. 5.5.4. Then
$M(T_1, x) = x^8 - 7x^6 + 9x^4 = M(T_2, x)$. (Confirm it.) It follows from Theorem
5.5.17 that $\det(xI_8 - A(T_1)) = \det(xI_8 - A(T_2))$, so $A(T_1)$ and $A(T_2)$ are similar.
Because $T_1$ and $T_2$ are not isomorphic, $A(T_1)$ and $A(T_2)$ cannot be permutation
similar. □

5.5.19 Example. If $G = K_3 = C_3$, then




Because G is not a forest, it follows from Theorem 5.5.17 that
$\det(xI_3 - A(G)) \ne M(G, x)$. Indeed, $\det(xI_3 - A(C_3)) = x^3 - 3x - 2$, whereas
$M(C_3, x) = x^3 - 3x$. (Check it.) □
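Both polynomials can be computed from an adjacency matrix and compared. The Python sketch below is an illustration under stated assumptions: the characteristic polynomial is obtained with the Faddeev-LeVerrier recurrence (a standard method, chosen here for convenience and not taken from the text), the matching polynomial is obtained by brute-force enumeration, and both are returned as lists whose k-th entry is the coefficient of $x^k$. The function names are hypothetical.

```python
from itertools import combinations

def char_poly(A):
    """Coefficients of det(xI - A) for an integer matrix A (Faddeev-LeVerrier)."""
    n = len(A)
    coeffs = [0] * n + [1]
    M = [[int(i == j) for j in range(n)] for i in range(n)]     # M_1 = I
    for k in range(1, n + 1):
        AM = [[sum(A[i][t] * M[t][j] for t in range(n)) for j in range(n)]
              for i in range(n)]
        c = -sum(AM[i][i] for i in range(n)) // k   # division is exact here
        coeffs[n - k] = c
        M = [[AM[i][j] + (c if i == j else 0) for j in range(n)]
             for i in range(n)]
    return coeffs

def matching_poly(A):
    """Coefficients of M(G, x) for the graph with adjacency matrix A."""
    n = len(A)
    edges = [(i, j) for i in range(n) for j in range(i + 1, n) if A[i][j]]
    poly = [0] * (n + 1)
    for r in range(n // 2 + 1):
        q = sum(1 for s in combinations(edges, r)
                if len({v for e in s for v in e}) == 2 * r)
        poly[n - 2 * r] = (-1) ** r * q
    return poly

P4 = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]   # a tree
C3 = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]                          # not a forest

print(char_poly(P4) == matching_poly(P4))    # True
print(char_poly(C3), matching_poly(C3))      # [-2, -3, 0, 1] [0, -3, 0, 1]
```

For the path on four vertices the two polynomials coincide, as Theorem 5.5.17 predicts for a forest; for the triangle they differ in the constant term.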






5.5. EXERCISES 



1 Use Definition 5.5.6 to confirm directly that $M(P_6, x) = x^6 - 5x^4 + 6x^2 - 1$.
(Make a list of all six 2-matchings.)

2 Compute the matching polynomial of

(a) [figure] (b) [figure]

3 Compute

(a) $M(K_6, x)$. (b) $M(K_7, x)$. (c) $M(P_7, x)$.
(d) $M(P_8, x)$. (e) $M(C_7, x)$. (f) $M(C_8, x)$.

4 Let $k_n$ be the number of perfect matchings in the complete graph $K_n$, $n \ge 2$.

(a) Compute $k_3$.

(b) Compute $k_4$.

(c) Compute $k_6$.

(d) Prove that $k_{n+2} = (n + 1) k_n$, $n \ge 2$.

(e) Prove that $k_{2r}$ is odd, $r \ge 1$.

5 Prove that M(G, x) is an invariant.

6 Prove that $M(G_1 + G_2, x) = M(G_1, x) M(G_2, x)$. (See Definition 5.3.10 for the
definition of graph union.)

7 Let G = (V, E) be a graph. An r-matching of G is maximal if it is not properly
contained in another matching of G. An r-matching is maximum if $r = \mu(G)$.

(a) Explain why every maximum matching is a maximal matching. 

(b) Give an example of a graph G with a matching M that is maximal but not 
maximum. 

8 Let G be a graph on three or more vertices. Suppose u and w are nonadjacent
vertices of G. If G + e is the graph obtained from G by adding a new
edge e = {u, w}, then Equation (5.23) can be written, in "reverse-angle" form,
as

    $M(G, x) = M(G + e, x) + M(G - u - w, x)$.

Use this formula, along with Example 5.5.13, to compute
(a) $M(K_5 - e, x)$. (b) $M(K_6 - e, x)$.

9 Let G = (V, E) be a graph on n vertices. A subset $K \subseteq V$ is a covering of G if,
for all $e \in E$, there is a $v \in K$ such that $v \in e$. (Note that the word "cover" is
being used a little differently here than in Definition 5.5.3.) The covering
number $\beta(G) = \min o(K)$, where the minimum is over all coverings of G. The
independence number $\alpha(G) = \max o(S)$, where the maximum is over all
independent sets $S \subseteq V$.

(a) Find a connected graph G such that $\alpha(G) < \beta(G)$.

(b) Find a connected graph G such that $\alpha(G) > \beta(G)$.

(c) Show that $\chi(G) \le 1 + \beta(G)$.

(d) Show that $\alpha(G) + \beta(G) = n$.

(e) Show that $\chi(G) + \beta(G^c) \ge n$.

(f) Show that $\mu(G) \le \beta(G)$.

(g) D. König proved that $\mu(G) = \beta(G)$ for any bipartite graph G. Find a
nonbipartite graph G for which $\mu(G) = \beta(G)$.

10 It can be shown that the derivative of the matching polynomial is given by the
equation

    $D_x M(G, x) = \sum_{u \in V} M(G - u, x)$.

(a) Use this result to prove that the Hermite polynomials satisfy the identity

    $D_x h_n(x) = n h_{n-1}(x)$,  $n \ge 2$.

(b) Use Exercise 4(c) and part (a) of this exercise to obtain $M(K_6, x)$ by
antidifferentiating Equation (5.29).

11 It can be shown that

    $M(G^c, x) = \sum_{r \ge 0} q(G, r) M(K_{n-2r}, x)$,

where $M(K_0, x) = 1$. Confirm this formula for the self-complementary graph
(a) $P_4$. (b) $C_5$.

12 Consider the matrices 



i) - b =(i ?)■ 
(a) Prove that they are not similar.

(b) Show that they have the same characteristic polynomial, namely, $(x - 1)^2$.
13 For each graph in Fig. 5.5.5, compute

(a) its degree sequence.

(b) its chromatic polynomial.

(c) its matching polynomial.

Figure 5.5.5 [graphs not reproduced]

14 If A is a real symmetric matrix, then its characteristic roots are all real. It
follows that A(G) has n real eigenvalues $\gamma_1(G) \ge \gamma_2(G) \ge \cdots \ge \gamma_n(G)$.
Compute these (graph) invariants for

(a) $G = K_3$. (b) $G = P_3$. (c) $G = K_4$.

(d) $G = C_4$. (e) $G = K_{1,3}$. (f) $G = K_2 \vee K_3^c$.

15 If $\gamma_1(G) \ge \gamma_2(G) \ge \cdots \ge \gamma_n(G)$ are the eigenvalues of A(G), show that
$\gamma_1(G) + \gamma_2(G) + \cdots + \gamma_n(G) = 0$.



16 Prove that the eigenvalues of $A(K_n)$ (see Exercise 14) are $n - 1$ with
multiplicity 1, and $-1$ with multiplicity $n - 1$.

17 It follows from Theorem 5.5.17 (and Exercise 14) that the roots of M(G, x) are
all real whenever G is a forest. In fact, the roots of M(G, x) are all real for any
graph G. Moreover, if $a_1 \ge a_2 \ge \cdots \ge a_n$ are the roots of M(G, x) and
$b_1 \ge b_2 \ge \cdots \ge b_{n-1}$ are the roots of M(G - u, x), then the b's interlace the
a's, i.e., $a_i \ge b_i \ge a_{i+1}$, $1 \le i < n$. Confirm that the roots of $M(K_4, x)$ interlace
the roots of $M(K_5, x)$.

18 Confirm that the number of different roots of M(G, x) is greater than the length
of a longest path in G when

(a) $G = P_3$. (b) $G = P_4$.

(c) G = [graph not reproduced]

(d) G = [graph not reproduced]

19 Let G be a connected graph. The edge connectivity $\varepsilon(G)$ is the smallest number
k for which there exist edges $e_1, e_2, \ldots, e_k \in E(G)$ such that
$G - e_1 - e_2 - \cdots - e_k$ is disconnected. If $G = K_n$, the vertex connectivity $\kappa(G) = n - 1$.
Otherwise, $\kappa(G)$ is the smallest number k for which there exist vertices
$u_1, u_2, \ldots, u_k \in V(G)$ such that $G - u_1 - u_2 - \cdots - u_k$ is disconnected.

(a) Prove that $\varepsilon(G) \le d_n(G)$, the minimum vertex degree.

(b) Prove that $\kappa(G) \le \varepsilon(G)$.

(c) Suppose $G \ne K_n$. If $\kappa(G) = 1$, then there is some vertex $u \in V(G)$ such
that G - u is disconnected. Such a vertex is called a cut vertex. A block of
G is a maximal subgraph that doesn't have a cut vertex. Prove that the
chromatic polynomial of a graph is uniquely determined by the chromatic
polynomials of its blocks.

20 Let $\det(xI_n - A(G)) = x^n + c_1 x^{n-1} + \cdots + c_n$ be the characteristic polynomial
of A(G). In 1963, H. Sachs proved that

\[ c_i = \sum_{H} (-1)^{c(H)}\, 2^{k(H)}, \]

where the summation extends over all i-vertex subgraphs H of G whose
connected components are either single edges or cycles, and where c(H) and
k(H) are the numbers of components and cycles, respectively. Use Sachs's
theorem to compute $\det(xI_n - A(G))$ for the graph

(a) $K_3$. (b) $P_3$. (c) $K_4$.

(d) $C_4$. (e) $K_{1,3}$. (f) $K_2 \vee K_3^c$.
(Hint: Your answer(s) should be consistent with Exercise 14.)

21 Use Sachs's theorem (Exercise 20) to prove Theorem 5.5.17.

22 Prove Sachs's theorem (Exercise 20).

23 Recall (Exercise 19, Section 3.5) that the permanent of an n x n matrix
$A = (a_{ij})$ is defined by

\[ \operatorname{per}(A) = \sum_{p \in S_n} \prod_{t=1}^{n} a_{t\,p(t)}. \]

If G is a bipartite graph, then the number of perfect matchings in G is the
square root of the permanent of A(G). Confirm this formula if

(a) $G = K_{1,3}$. (b) $G = P_4$. (c) $G = C_4$.

24 Show that per(A(G)) is a (graph) invariant. (See Exercise 23.) 

25 Important to the theory of matchings is the concept of adjacent edges. This
notion arises in other contexts as well. Associated with graph G is its line
graph, $G^{\#}$. The vertex set of $G^{\#}$ is $V(G^{\#}) = E(G)$, i.e., the vertices of $G^{\#}$ are
the edges of G. The edges of $G^{\#}$ are those pairs of its vertices that are adjacent
edges in G.

(a) Show that the line graph of $K_4$ is isomorphic to $K_6 - M$, where M is a
perfect matching.

(b) Show that the line graph of the wheel, $W = K_1 \vee C_5$, is isomorphic to the
graph in Fig. 5.5.6.

Figure 5.5.6

26 A walk in G of length r is a sequence of vertices $u_0, u_1, \ldots, u_r$ in which
$\{u_{i-1}, u_i\} \in E(G)$, $1 \le i \le r$. (A path is a walk consisting of distinct vertices.)
If $V(G) = \{1, 2, \ldots, n\}$,

(a) prove that the number of walks in G of length r, from vertex i to vertex j, is
the (i, j)-entry of $A(G)^r$.

(b) prove that the distance from vertex i to vertex j is the smallest value of k
such that the (i, j)-entry of $A(G)^k$ is not zero.
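The statement in Exercise 26 can be checked numerically before attempting a proof. The following is only a sketch: the helper names (`mat_pow`, `count_walks`) are hypothetical, vertices are numbered from 0 rather than 1, and the path $P_4$ is an arbitrary test graph. It compares entries of powers of the adjacency matrix with a direct recursive count of walks.

```python
def mat_mul(a, b):
    """Product of two square integer matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(a, r):
    """r-th power of a square matrix (r >= 0) by repeated multiplication."""
    n = len(a)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(r):
        result = mat_mul(result, a)
    return result


def count_walks(adj, i, j, r):
    """Count walks of length r from i to j directly from the definition."""
    if r == 0:
        return int(i == j)
    return sum(count_walks(adj, k, j, r - 1)
               for k in range(len(adj)) if adj[i][k])


# Adjacency matrix of the path P_4 on vertices 0, 1, 2, 3 (0-based labels).
P4 = [[0, 1, 0, 0],
      [1, 0, 1, 0],
      [0, 1, 0, 1],
      [0, 0, 1, 0]]

# Smallest k for which the (0, 3)-entry of A^k is nonzero.
distance_0_3 = next(k for k in range(len(P4)) if mat_pow(P4, k)[0][3] != 0)
```

A numerical agreement of this kind is evidence, not a proof; the induction on r asked for in part (a) is still required.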

27 Give a formal proof of Theorem 5.5.15. 

28 Let G be a graph on n vertices. The Hosoya topological index of G is

\[ H(G) = \sum_{r=0}^{\lfloor n/2 \rfloor} q(G, r). \]

(a) Show that $H(P_1) = 1$ and $H(P_2) = 2$.

(b) Show that $H(P_{n+1}) = H(P_n) + H(P_{n-1})$, $n \ge 2$.

(c) Show that $H(P_n) = F_n$, the nth Fibonacci number, $n \ge 1$. (See Section 1.2,
Exercise 19.)

(d) Show that $H(C_n) = F_n + F_{n-2}$, $n \ge 3$.
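As a sanity check on Exercise 28, the Hosoya index of small paths and cycles can be computed by brute force. This sketch assumes that q(G, r) counts the r-matchings of G with q(G, 0) = 1, and that the Fibonacci numbers are indexed so that $F_1 = 1$ and $F_2 = 2$, as parts (a) and (c) suggest; the function names are illustrative only.

```python
from itertools import combinations


def hosoya_index(n_vertices, edges):
    """Total number of matchings of all sizes, the empty matching included."""
    total = 0
    for r in range(n_vertices // 2 + 1):
        for subset in combinations(edges, r):
            endpoints = [v for edge in subset for v in edge]
            if len(endpoints) == len(set(endpoints)):  # no two edges adjacent
                total += 1
    return total


def path_edges(n):
    """Edges of the path P_n on vertices 1..n."""
    return [(i, i + 1) for i in range(1, n)]


def cycle_edges(n):
    """Edges of the cycle C_n on vertices 1..n (n >= 3)."""
    return path_edges(n) + [(n, 1)]


path_values = [hosoya_index(n, path_edges(n)) for n in range(1, 8)]
cycle_values = [hosoya_index(n, cycle_edges(n)) for n in range(3, 7)]
```

The enumeration is exponential in the number of edges, so it is suitable only for checking small cases against the recurrences.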

5.6. ORIENTED GRAPHS 

Destiny is not a matter of chance, it is a matter of choice. 

— William Jennings Bryan 

If $G = (V, E)$ is a graph with n vertices and m edges then, by definition, E is an
m-element subset of $V^{(2)}$. Not to be confused with the cartesian product
$V \times V = \{(u, v) : u, v \in V\}$, whose elements are ordered pairs of vertices, the
elements of $V^{(2)}$ are unordered.

5.6.1 Definition. An orientation of $G = (V, E)$ is a function $f : E \to V \times V$
such that, for all $e = \{u, v\} \in E$, the oriented edge f(e) is one of (u, v) or (v, u).

By the fundamental counting principle, a graph with m edges has $2^m$ orientations.
By convention, the number of orientations of the edgeless graph $K_n^c$ is $2^0 = 1$.

An oriented graph* is a graph with a nonempty set of edges and some prescribed
orientation. The situation in which G is oriented by f, and f(e) = (u, v) for some
$e = \{u, v\} \in E(G)$, is summarized by referring to e = (u, v) as an oriented edge of G.

If e = (u, v) is an oriented edge of G, then vertex v is the head of e, and vertex
u is its tail. Consistent with this language, e is typically illustrated by a
directed arc, or arrow, from u to v.

5.6.2 Example. Suppose four ultimate frisbee teams enter a round-robin tournament
in which they are seeded (ranked) 1-4. The outcome of such a tournament can
be illustrated by an orientation of $K_4$ in which oriented edge e = (u, v) indicates
that team u won its match with team v.† In the outcome illustrated by Fig.
5.6.1a, e.g., team 1 fulfilled the expectations of the organizers by beating every
other team in the tournament. On the other hand, having lost all of its games,
team 2 seems to have underperformed.

The notorious intransitivity of athletic competitions is illustrated in Fig. 5.6.1b.
Represented here is a tournament in which team 1 beat team 3 and team 3 beat team
2, but team 1 lost to team 2. (Unlike the first outcome, some sort of tie-breaking
procedure will be required to determine the championship team in the second
tournament.) □

5.6.3 Definition. A directed path of length r in the oriented graph G is a path
$[w_0, w_1, \ldots, w_r]$ in which $(w_{i-1}, w_i)$ is an oriented edge of G, $1 \le i \le r$. A directed
cycle of length r in G is a cycle $\langle w_1, w_2, \ldots, w_r \rangle$ in which $(w_r, w_1)$ and $(w_{i-1}, w_i)$,
$1 < i \le r$, are oriented edges.

Figure 5.6.1 (a), (b): two orientations of $K_4$ on vertices 1-4 [drawings not reproduced]

*An oriented graph is a special kind of directed graph in which at most one of (u, v) and (v, u) can be an edge.
†This has become such a widely accepted model for round-robin tournaments that oriented complete graphs
have come, themselves, to be known as tournaments.

5.6.4 Example. The oriented graph illustrated in Fig. 5.6.1b contains three
directed cycles: $\langle 1, 3, 2 \rangle$, $\langle 1, 4, 2 \rangle$, and $\langle 1, 3, 4, 2 \rangle$. The oriented graph in
Fig. 5.6.1a has none. □

5.6.5 Definition. An orientation of G is acyclic if it contains no directed cycles. 

Because a tree has no cycles at all, each of its orientations is acyclic. What about
some arbitrary graph having m edges? Of its $2^m$ orientations, how many are acyclic?

5.6.6 Stanley's Theorem.* If G is a graph with n vertices, m edges, and chromatic
polynomial p(G, x), then the number of acyclic orientations of G is
$(-1)^n p(G, -1)$.

Proof Sketch. Let c(G) be the number of acyclic orientations of G and set
$\rho(G) = (-1)^n p(G, -1)$. The heart of the proof lies in showing that $c(G) - \rho(G) =
c(G - e) - \rho(G - e)$, $e \in E(G)$. Because $c(K_n^c) = \rho(K_n^c) = 1$, this yields a proof
by induction on m. Details are omitted. ■

5.6.7 Example. Given that $p(K_4, x) = x(x - 1)(x - 2)(x - 3)$, we can use
Stanley's theorem to determine that, of the 64 orientations of $K_4$,
$(-1)^4 p(K_4, -1) = 4! = 24$ are acyclic.

If $G = C_n$, then G has n edges and $2^n$ orientations. According to Exercise 9(c) of
Section 5.3,

\[ p(C_n, x) = (x - 1)^n + (-1)^n (x - 1). \]

So, by Stanley's theorem, $C_n$ has $2^n - 2$ acyclic orientations. Indeed, the two
remaining orientations might well be labeled clockwise and counterclockwise.

If T is a tree on $n \ge 2$ vertices then, by Theorem 5.3.16, $p(T, x) = x(x - 1)^{n-1}$.
So, T has $(-1)^n p(T, -1) = 2^{n-1}$ acyclic orientations. Because T has $m = n - 1$
edges, it has a total of $2^{n-1}$ orientations, confirming that every orientation of every
(nontrivial) tree is acyclic. □
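The three counts in Example 5.6.7 can be confirmed by brute force. The sketch below is an illustration rather than part of the text: it enumerates all $2^m$ orientations of a small graph and tests each for directed cycles with Kahn's topological-sort procedure (a standard technique, not one the text uses). The function names and the choice of $C_5$ and $P_4$ as test cases are assumptions made here.

```python
from itertools import product


def is_acyclic(n, arcs):
    """True if the digraph on vertices 1..n with the given arcs has no directed cycle."""
    indegree = {v: 0 for v in range(1, n + 1)}
    successors = {v: [] for v in range(1, n + 1)}
    for tail, head in arcs:
        successors[tail].append(head)
        indegree[head] += 1
    ready = [v for v in indegree if indegree[v] == 0]
    removed = 0
    while ready:  # repeatedly delete a vertex with no incoming arcs
        u = ready.pop()
        removed += 1
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    return removed == n  # every vertex deleted exactly when no cycle exists


def count_acyclic_orientations(n, edges):
    """Count acyclic orientations by trying both directions of every edge."""
    count = 0
    for flips in product((False, True), repeat=len(edges)):
        arcs = [(v, u) if flip else (u, v) for (u, v), flip in zip(edges, flips)]
        if is_acyclic(n, arcs):
            count += 1
    return count


K4 = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
C5 = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]
P4 = [(1, 2), (2, 3), (3, 4)]  # a tree on 4 vertices

print(count_acyclic_orientations(4, K4))  # 24 = 4!
print(count_acyclic_orientations(5, C5))  # 30 = 2^5 - 2
print(count_acyclic_orientations(4, P4))  # 8 = 2^3
```

The running time doubles with every edge, so this is a check for small graphs only; Stanley's theorem replaces the enumeration with a single evaluation of the chromatic polynomial.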

5.6.8 Definition. Suppose $G = (V, E)$ is an oriented graph with vertex set
$V = \{1, 2, \ldots, n\}$ and edge set $E = \{e_1, e_2, \ldots, e_m\}$. Let $Q(G) = (q_{ij})$ be the
n x m matrix defined by $q_{ij} = 1$ if vertex i is the head of edge $e_j$, $-1$ if i is the
tail of $e_j$, and 0 otherwise. Then Q(G) is an oriented vertex-edge incidence matrix
for G.



*R. P. Stanley, Acyclic orientations of graphs, Discrete Math. 5 (1973), 171-178. 
It can be useful to think of Q(G) as a vertex-by-edge matrix. If oriented edge
e = (u, v), then column e of Q(G) contains precisely two nonzero entries: -1 in
row u and +1 in row v. The number of nonzero entries in row w of Q(G) is
$d_G(w)$, the degree of vertex w.

5.6.9 Example. Let $G = K_4$, numbered and oriented as in Fig. 5.6.1a.
If the edges of G are numbered in dictionary order, i.e., if $e_1 = \{1, 2\}$, $e_2 = \{1, 3\}$,
$e_3 = \{1, 4\}$, $e_4 = \{2, 3\}$, $e_5 = \{2, 4\}$, and $e_6 = \{3, 4\}$, then

\[ Q(G) = \begin{pmatrix}
-1 & -1 & -1 & 0 & 0 & 0 \\
1 & 0 & 0 & 1 & 1 & 0 \\
0 & 1 & 0 & -1 & 0 & -1 \\
0 & 0 & 1 & 0 & -1 & 1
\end{pmatrix}. \]

If $H = K_4$, with the same numbering of vertices and edges, but with the orientation
illustrated in Fig. 5.6.1b, then Q(H) differs from Q(G) by the signs of the entries in
its first column. □

As usual, denote by $Q^t$ the transpose of $Q = Q(G) = (q_{ij})$, i.e., the m x n matrix
whose (i, j)-entry is $q_{ji}$.

5.6.10 Theorem. Let G be a graph with vertex set $V(G) = \{1, 2, \ldots, n\}$. If
Q = Q(G) is an oriented vertex-edge incidence matrix corresponding to some
orientation of G and some numbering of its edges, then the (i, j)-entry of $QQ^t$ is

\[ (QQ^t)_{ij} = \begin{cases}
d_G(i) & \text{if } i = j, \\
-1 & \text{if } i \ne j \text{ and } \{i, j\} \in E(G), \\
0 & \text{otherwise.}
\end{cases} \]

While Q = Q(G) depends both on the orientation and the numbering of the
edges of G, it follows from Theorem 5.6.10 that $QQ^t$ depends on neither.



Proof of Theorem 5.6.10. From the definitions of transpose and matrix multiplication,
the (i, j)-entry of $QQ^t$ is

\[ \sum_{r=1}^{m} q_{ir}\,(Q^t)_{rj} = \sum_{r=1}^{m} q_{ir}\,q_{jr}. \qquad (5.33) \]

If i = j, then $q_{ir} q_{jr} = q_{ir}^2$, and Equation (5.33) is the sum of the squares of the entries
in row i of Q(G). Since $q_{ir}$ is ±1 when vertex i is incident with edge $e_r$, and 0 otherwise,
the sum of the $q_{ir}^2$ is precisely $d_G(i)$.

If $i \ne j$, then $q_{ir} q_{jr} \ne 0$ if and only if $q_{ir} \ne 0 \ne q_{jr}$, if and only if $\{i, j\} =
e_r \in E(G)$, if and only if $q_{ir} q_{jr} = -1$. Hence, the (i, j)-entry of $QQ^t$ is -1 when
$\{i, j\} \in E(G)$, and 0 otherwise. ■
5.6.11 Definition. If G is a graph with vertex set $\{1, 2, \ldots, n\}$, let $D(G) =
\operatorname{diag}(d_G(1), d_G(2), \ldots, d_G(n))$ be the n x n diagonal matrix of vertex degrees.
The Laplacian matrix $L(G) = D(G) - A(G)$, where A(G) is the adjacency matrix
of G.

5.6.12 Corollary. Let G be a graph with vertex set $V = \{1, 2, \ldots, n\}$. If
Q = Q(G) is an oriented vertex-edge incidence matrix with respect to some fixed
but arbitrary numbering of the edges of G, then $QQ^t = L(G)$.

Proof. Immediate from Theorem 5.6.10 and Definition 5.6.11. ■ 

5.6.13 Example. If H is the graph in Fig. 5.6.2a, then

\[ L(H) = \begin{pmatrix}
1 & 0 & 0 & 0 & -1 \\
0 & 2 & -1 & 0 & -1 \\
0 & -1 & 3 & -1 & -1 \\
0 & 0 & -1 & 2 & -1 \\
-1 & -1 & -1 & -1 & 4
\end{pmatrix}. \]

With respect to the orientation exhibited in Fig. 5.6.2b and the edge numbering
$e_1 = (1, 5)$, $e_2 = (2, 3)$, $e_3 = (5, 2)$, $e_4 = (3, 4)$, $e_5 = (5, 3)$, and $e_6 = (5, 4)$,

\[ Q(H) = \begin{pmatrix}
-1 & 0 & 0 & 0 & 0 & 0 \\
0 & -1 & 1 & 0 & 0 & 0 \\
0 & 1 & 0 & -1 & 1 & 0 \\
0 & 0 & 0 & 1 & 0 & 1 \\
1 & 0 & -1 & 0 & -1 & -1
\end{pmatrix}. \]

It is left to the reader to confirm that $Q(H)Q(H)^t = L(H)$. □
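For readers who would like a machine check of the identity $QQ^t = L$ before doing it by hand, here is a minimal sketch. It assumes the graph H of Fig. 5.6.2a has the six edges listed in Example 5.6.13; the function names are hypothetical, and matrices are stored as plain lists of rows with 0-based indices.

```python
def incidence_matrix(n, arcs):
    """Oriented vertex-edge incidence matrix: -1 at the tail, +1 at the head."""
    q = [[0] * len(arcs) for _ in range(n)]
    for col, (tail, head) in enumerate(arcs):
        q[tail - 1][col] = -1
        q[head - 1][col] = 1
    return q


def times_transpose(q):
    """Compute Q Q^t: entry (i, j) is the dot product of rows i and j of Q."""
    n = len(q)
    return [[sum(a * b for a, b in zip(q[i], q[j])) for j in range(n)]
            for i in range(n)]


def laplacian(n, edges):
    """L(G) = D(G) - A(G), built edge by edge; orientation is ignored."""
    lap = [[0] * n for _ in range(n)]
    for u, v in edges:
        lap[u - 1][u - 1] += 1
        lap[v - 1][v - 1] += 1
        lap[u - 1][v - 1] -= 1
        lap[v - 1][u - 1] -= 1
    return lap


# Orientation and edge numbering taken from Example 5.6.13.
arcs_H = [(1, 5), (2, 3), (5, 2), (3, 4), (5, 3), (5, 4)]
# A second orientation, with every arc reversed, to illustrate independence.
reversed_arcs_H = [(head, tail) for tail, head in arcs_H]

product_H = times_transpose(incidence_matrix(5, arcs_H))
product_reversed = times_transpose(incidence_matrix(5, reversed_arcs_H))
```

Both products agree with `laplacian(5, arcs_H)`, which is consistent with the remark after Theorem 5.6.10 that $QQ^t$ depends neither on the orientation nor on the edge numbering.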



Let A be a generic n x n matrix and denote by $A_{ij}$ the (n - 1)-square submatrix
of A obtained by deleting its ith row and jth column. Recall from linear algebra
that the classical adjoint (or adjugate) of A, call it $A^\dagger$, is the n x n matrix whose
(i, j)-entry is $(-1)^{i+j} \det(A_{ji})$. The result which makes classical adjoints worth
knowing about is this:

\[ AA^\dagger = \det(A)\,I_n. \qquad (5.34) \]

It is from Equation (5.34) that one obtains the formula $A^{-1} = [\det(A)]^{-1} A^\dagger$
whenever $\det(A) \ne 0$.

If G is a fixed but arbitrary graph on n vertices then $L(G)Y_n = 0$, where (in this
section) $Y_n$ is the n x 1 column vector each of whose entries is 1. This is because
the number of 1's in row i of A(G) is equal to $d_G(i)$, the (i, i)-entry of D(G). It
follows that rank L(G) < n, so

\[ \det(L(G)) = 0. \]



Setting A = L(G) in Equation (5.34) gives $L(G)L(G)^\dagger = 0$, from which it
follows that L(G)C = 0 for every column C of $L(G)^\dagger$. If rank $L(G) \le n - 2$, this
is perfectly understandable because, in that case, C = 0. On the other hand, if
rank $L(G) = n - 1$, then L(G)C = 0 if and only if C is a multiple of $Y_n$. In either
case,

\[ L(G)^\dagger = \begin{pmatrix}
a & b & c & \cdots & d \\
a & b & c & \cdots & d \\
\vdots & \vdots & \vdots & & \vdots \\
a & b & c & \cdots & d
\end{pmatrix}, \qquad (5.35) \]

where a, b, c, ..., and d are constants. Since $\det(A) = \det(A^t)$, the classical adjoint
of a symmetric matrix is symmetric. Thus, the numbers in the first column of $L(G)^\dagger$
equal the numbers in its first row. From Equation (5.35), this means all the entries
of $L(G)^\dagger$ are equal, and it proves the following.

5.6.14 Theorem. If G is a graph on n vertices, then there exists an integer t(G)
depending only on G such that

\[ t(G) = (-1)^{i+j} \det(L(G)_{ij}), \quad 1 \le i, j \le n. \]

Moreover, t(G) = 0 if and only if rank $L(G) \le n - 2$.
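Theorem 5.6.14 can be illustrated on the Laplacian matrix L(H) of Example 5.6.13. The sketch below computes the whole classical adjoint with exact integer arithmetic; the cofactor-expansion `det` is a deliberately simple choice suitable for small matrices, and both function names are hypothetical.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (exact for integers)."""
    if not m:
        return 1
    total = 0
    for j, entry in enumerate(m[0]):
        if entry:
            minor = [row[:j] + row[j + 1:] for row in m[1:]]
            total += (-1) ** j * entry * det(minor)
    return total


def adjugate(a):
    """Classical adjoint: the (i, j)-entry is (-1)^(i+j) det(A_ji)."""
    n = len(a)

    def delete(row, col):
        return [r[:col] + r[col + 1:] for k, r in enumerate(a) if k != row]

    return [[(-1) ** (i + j) * det(delete(j, i)) for j in range(n)]
            for i in range(n)]


# L(H) as displayed in Example 5.6.13.
L_H = [[1, 0, 0, 0, -1],
       [0, 2, -1, 0, -1],
       [0, -1, 3, -1, -1],
       [0, 0, -1, 2, -1],
       [-1, -1, -1, -1, 4]]

adj_H = adjugate(L_H)
print(adj_H[0])  # [8, 8, 8, 8, 8]
```

Every one of the 25 entries comes out equal to 8, the common value that the theorem calls t(H).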

When an integer emerges in a combinatorial setting, it is natural to expect that it 
counts something. 

5.6.15 Definition. Let H = (W, F) be a subgraph of G = (V, E). If W = V, then
H is a spanning subgraph of G. A spanning tree is a spanning subgraph that is a tree. 

A spanning subgraph is one that uses all of the vertices and some of the edges. In 
particular, graph G has only one induced spanning subgraph, namely, G itself. 
5.6.16 Example. The graph in Fig. 5.6.2a has the eight spanning trees illustrated
in Fig. 5.6.3. □

5.6.17 Matrix-Tree Theorem. If G is a graph, then t(G) is the number of
different spanning trees in G.

Proof Sketch. By Theorem 5.6.14, it suffices to compute, say, the (1, 1)-entry
of $L(G)^\dagger$. Because $L(G) = Q(G)Q(G)^t$, this computation can be done using a
classical (nineteenth-century) result known as the Cauchy-Binet determinant
theorem. The effect of this computation is to express t(G) as a sum of squares of
(n - 1) x (n - 1) subdeterminants of Q. Finally, by an old result of Poincaré, these
subdeterminants have absolute value 1 or 0, depending on whether they correspond
to edges in a spanning tree or not. The details are beyond the scope of this
book. ■



†Theorem 5.6.17 concerns the number of different spanning trees. The fact that there are numerous
isomorphisms among the trees in Fig. 5.6.3 is irrelevant to the computation of t(H).

5.6.18 Example. If H is the graph in Fig. 5.6.2a, then, by Example 5.6.16, the
spanning tree number t(H) = 8. Let's use Theorem 5.6.14 to compute t(H). From
Example 5.6.13,

\[ L(H) = \begin{pmatrix}
1 & 0 & 0 & 0 & -1 \\
0 & 2 & -1 & 0 & -1 \\
0 & -1 & 3 & -1 & -1 \\
0 & 0 & -1 & 2 & -1 \\
-1 & -1 & -1 & -1 & 4
\end{pmatrix}. \]

To compute, say, the (5, 3)-entry of $L(H)^\dagger$, take the product of $(-1)^{5+3}$ and the
determinant of the matrix obtained from L(H) by deleting its third row and fifth
column, i.e.,

\[ t(H) = (-1)^8 \det \begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 2 & -1 & 0 \\
0 & 0 & -1 & 2 \\
-1 & -1 & -1 & -1
\end{pmatrix}. \]

Expanding this determinant along the first row yields

\[ t(H) = \det \begin{pmatrix}
2 & -1 & 0 \\
0 & -1 & 2 \\
-1 & -1 & -1
\end{pmatrix}
= 2 \det \begin{pmatrix} -1 & 2 \\ -1 & -1 \end{pmatrix}
+ \det \begin{pmatrix} 0 & 2 \\ -1 & -1 \end{pmatrix}
= 2(3) + 2 = 8. \]

□
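The value t(H) = 8 can also be confirmed without any determinants, by listing subsets of edges. The following sketch assumes H has the six edges read off from L(H) in Example 5.6.13; it tests each set of n - 1 edges for cycles with a small union-find structure, which is a convenience chosen here and not a method from the text.

```python
from itertools import combinations


def is_spanning_tree(n, edge_subset):
    """True if the n - 1 given edges on vertices 1..n contain no cycle."""
    parent = list(range(n + 1))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v in edge_subset:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:  # this edge would close a cycle
            return False
        parent[root_u] = root_v
    return True  # n - 1 acyclic edges on n vertices form a spanning tree


def count_spanning_trees(n, edges):
    """Count spanning trees by checking every (n - 1)-element set of edges."""
    return sum(is_spanning_tree(n, subset)
               for subset in combinations(edges, n - 1))


edges_H = [(1, 5), (2, 3), (2, 5), (3, 4), (3, 5), (4, 5)]
edges_K4 = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]

print(count_spanning_trees(5, edges_H))   # 8
print(count_spanning_trees(4, edges_K4))  # 16
```

As the footnote to Theorem 5.6.17 stresses, the count is of different spanning trees, so isomorphic trees on different edge sets are counted separately, exactly as this enumeration does.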



Because it is a symmetric matrix, the eigenvalues of L(G) are all real. Indeed,
because $L(G) = Q(G)Q(G)^t$, its eigenvalues are all nonnegative!

5.6.19 Definition. If G is a graph on n vertices, the spectrum of L(G) is denoted
$s(G) = (\lambda_1(G), \lambda_2(G), \ldots, \lambda_n(G))$, where

\[ \lambda_1(G) \ge \lambda_2(G) \ge \cdots \ge \lambda_n(G) \ge 0 \qquad (5.36) \]

are the eigenvalues of L(G) arranged in nonincreasing order.



5.6.20 Example. Computations show that the (Laplacian) characteristic polynomial
of the graph H in Fig. 5.6.2a is

\[ \det(xI_5 - L(H)) = x^5 - 12x^4 + 49x^3 - 78x^2 + 40x = x(x - 1)(x - 2)(x - 4)(x - 5), \]

so s(H) = (5, 4, 2, 1, 0). □
Recall that

\[ (x - \lambda_1)(x - \lambda_2) \cdots (x - \lambda_n) = x^n - E_1 x^{n-1} + \cdots + (-1)^n E_n, \]

where $E_r = E_r(\lambda_1, \lambda_2, \ldots, \lambda_n)$ is the rth elementary symmetric function. In particular,
up to the sign $(-1)^{n-1}$, the coefficient of x in the characteristic polynomial
$\det(xI_n - L(G))$ is

\[ E_{n-1}(s(G)) = E_{n-1}(\lambda_1(G), \lambda_2(G), \ldots, \lambda_n(G)). \]

Because L(G) is singular, $\lambda_n(G) = 0$. Therefore,

\[ E_{n-1}(s(G)) = \prod_{i=1}^{n-1} \lambda_i(G). \qquad (5.37) \]

On the other hand, up to the same sign, the coefficient of x in $\det(xI_n - L(G))$ is

\[ \sum_{i=1}^{n} \det(L(G)_{ii}) = n\,t(G). \qquad (5.38) \]

5.6.21 Corollary. If G is a graph with Laplacian spectrum $s(G) = (\lambda_1(G),
\lambda_2(G), \ldots, \lambda_n(G))$ and spanning tree number t(G), then

\[ n\,t(G) = \prod_{i=1}^{n-1} \lambda_i(G). \]

In particular, $\lambda_{n-1}(G) > 0$ if and only if G is connected.

Proof. The first statement follows from Equations (5.37) and (5.38). The second is 
a consequence of the fact that G has a spanning tree if and only if it is connected. 

■ 
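Corollary 5.6.21 can be checked against Examples 5.6.18 and 5.6.20. The sketch below evaluates $\det(xI_5 - L(H))$ at integer points with exact arithmetic; since a monic polynomial of degree 5 has at most five roots, finding five distinct zeros pins down the spectrum. The function names are hypothetical, and cofactor expansion is used only because the matrix is small.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (exact for integers)."""
    if not m:
        return 1
    total = 0
    for j, entry in enumerate(m[0]):
        if entry:
            minor = [row[:j] + row[j + 1:] for row in m[1:]]
            total += (-1) ** j * entry * det(minor)
    return total


def char_poly_at(a, x):
    """Value of det(x I - A) at the number x."""
    n = len(a)
    return det([[(x if i == j else 0) - a[i][j] for j in range(n)]
                for i in range(n)])


# L(H) from Example 5.6.13 and the spectrum claimed in Example 5.6.20.
L_H = [[1, 0, 0, 0, -1],
       [0, 2, -1, 0, -1],
       [0, -1, 3, -1, -1],
       [0, 0, -1, 2, -1],
       [-1, -1, -1, -1, 4]]
claimed_spectrum = [5, 4, 2, 1, 0]

values_at_roots = [char_poly_at(L_H, x) for x in claimed_spectrum]

product_of_nonzero = 1
for eigenvalue in claimed_spectrum[:-1]:
    product_of_nonzero *= eigenvalue

n, t_H = 5, 8  # t(H) = 8 from Example 5.6.18
```

Here `values_at_roots` is all zeros and `product_of_nonzero` equals `n * t_H`, i.e., 5 · 4 · 2 · 1 = 40 = 5 · 8, as the corollary predicts.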

Corollary 5.6.21 suggests that $\lambda_{n-1}(G)$ might be viewed as a quantitative
measure of connectivity.

5.6.22 Definition. If G is a graph, its algebraic connectivity is $a(G) = \lambda_{n-1}(G)$,
the second smallest eigenvalue of L(G).

What about the other eigenvalues? Using an argument similar to the one that
established Equation (5.32) in Section 5.5, one can show that $G_1$ and $G_2$ are
isomorphic if and only if $L(G_1)$ and $L(G_2)$ are permutation similar. Because
symmetric matrices are similar if and only if they have the same eigenvalues, it
follows that s(G) is an invariant of G. But, what do the eigenvalues of L(G)
mean graph theoretically? To a large extent, that is still an open question. One thing
that is known follows from an old result of I. Schur.

5.6.23 Definition. Suppose $(a) = (a_1, a_2, \ldots, a_s)$ and $(b) = (b_1, b_2, \ldots, b_t)$ are
two nonincreasing sequences of real numbers that satisfy $a_1 + a_2 + \cdots + a_s =
b_1 + b_2 + \cdots + b_t$. Then (a) majorizes (b), written $(a) \succ (b)$, if $s \le t$ and

\[ \sum_{i=1}^{r} a_i \ge \sum_{i=1}^{r} b_i, \quad 1 \le r \le s. \qquad (5.39) \]



The algebraic connectivity was introduced by Miroslav Fiedler. 
5.6.24 Example. The degree sequence for the graph H in Fig. 5.6.2a is d(H) =
(4, 3, 2, 2, 1), a partition of 12 = 2m. From Example 5.6.20, s(H) = (5, 4, 2, 1, 0).
To see that s(H) majorizes d(H), observe that

5 > 4,
5 + 4 > 4 + 3,
5 + 4 + 2 > 4 + 3 + 2,
5 + 4 + 2 + 1 > 4 + 3 + 2 + 2,

and

5 + 4 + 2 + 1 + 0 = 4 + 3 + 2 + 2 + 1. □
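The partial-sum test of Definition 5.6.23 is easy to mechanize. The function below is a sketch under two assumptions made here: inputs are sorted into nonincreasing order first, and sequences with unequal sums are simply reported as not comparable (the definition presupposes equal sums). The name `majorizes` is hypothetical.

```python
def majorizes(a, b):
    """True if (a) majorizes (b) in the sense of Definition 5.6.23."""
    a = sorted(a, reverse=True)
    b = sorted(b, reverse=True)
    if sum(a) != sum(b) or len(a) > len(b):  # need equal sums and s <= t
        return False
    partial_a = partial_b = 0
    for x, y in zip(a, b):  # zip stops after the first s = len(a) terms
        partial_a += x
        partial_b += y
        if partial_a < partial_b:  # Inequality (5.39) fails for this r
            return False
    return True


s_H = (5, 4, 2, 1, 0)  # Laplacian spectrum from Example 5.6.20
d_H = (4, 3, 2, 2, 1)  # degree sequence from Example 5.6.24

print(majorizes(s_H, d_H))  # True
print(majorizes(d_H, s_H))  # False, since 4 < 5 already at r = 1
```

With integer entries the comparison is exact; for eigenvalues computed in floating point, a small tolerance would be needed in the comparison.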

In fact, Example 5.6.24 is typical. 

5.6.25 (Schur's Majorization) Theorem.* If G is a graph with degree sequence
d(G) and (Laplacian) spectrum s(G), then s(G) majorizes d(G).

The proof of Theorem 5.6.25 is beyond the scope of this book. 

Returning to the issue of invariants, graphs $G_1$ and $G_2$ are isomorphic only if
they have the same chromatic polynomial, the same matching polynomial, the
same adjacency characteristic polynomial, and the same Laplacian characteristic
polynomial. While no single one of these polynomials characterizes graphs up to
isomorphism, might all four, taken in combination, do the job? As shown by Allen
Schwenk† and Brendan McKay,‡ the answer is an emphatic negative.

5.6.26 Theorem. Let P(n) be the probability that given a randomly chosen tree
$T_1$ on n vertices, there is a nonisomorphic tree $T_2$ such that, simultaneously,

(a) $p(T_1, x) = p(T_2, x)$,

(b) $M(T_1, x) = M(T_2, x)$,

(c) $\det(xI_n - A(T_1)) = \det(xI_n - A(T_2))$, and

(d) $\det(xI_n - L(T_1)) = \det(xI_n - L(T_2))$.

Then $\lim_{n \to \infty} P(n) = 1$.



*Theorem 5.6.25 is a special case of a more general theorem published by Issai Schur in 1923. An
improvement of Theorem 5.6.25 can be found in the article: R. D. Grone, Eigenvalues and the degree
sequence of graphs, Linear & Multilinear Algebra 39 (1995), 133-136.

†A. J. Schwenk, Almost all trees are cospectral, in New Directions in the Theory of Graphs, Academic
Press, New York, 1973, pp. 275-307.

‡B. D. McKay, On the spectral characteristics of trees, Ars Combinatoria 3 (1977), 219-232.



Proof Sketch. From Theorem 5.3.16, any two trees on n vertices have the same
chromatic polynomial, namely, $x(x - 1)^{n-1}$. By Theorem 5.5.17, parts (b) and
(c) are equivalent. Thus, it suffices to obtain the conclusion for trees that
simultaneously satisfy parts (c) and (d).

The proof is in two parts. The first is to find a pair of trees, $L_1$ and $L_2$, with
vertices $u \in V(L_1)$ and $w \in V(L_2)$, such that the following property holds: For any tree
T, and any vertex v of T, if $T_1$ is the tree obtained from T by identifying vertex u of $L_1$ with
vertex v, and $T_2$ the tree obtained from T by identifying vertex w of $L_2$ with vertex
v, then parts (c) and (d) hold for $T_1$ and $T_2$. Informally, $T_1$ and $T_2$ might be thought
of as the trees obtained from T by grafting on, at vertex v, limbs isomorphic to $L_1$
(at vertex u) and $L_2$ (at vertex w), respectively.

The second part is to prove that the probability of finding a limb isomorphic to
$L_1$ (at vertex u), on a randomly chosen n-vertex tree $T_1$, goes to 1 as n goes to
infinity. It then remains to show that if $T_2$ is the tree obtained from $T_1$ by pruning off
limb $L_1$ and grafting limb $L_2$ in its place, then $T_2$ is not isomorphic to $T_1$. ■

5.6. EXERCISES 



1 Compute both the number of orientations and the number of acyclic orientations
of the graph

(a), (b), (c): [graphs not reproduced]

2 Compute the number of acyclic orientations for the graph G in Fig. 5.6.4.

Figure 5.6.4

3 Exhibit the oriented vertex-edge incidence matrix Q = Q(G) for the graph G in
Fig. 5.6.4 with orientation given by

(a) $e_1 = (1, 2)$, $e_2 = (3, 2)$, $e_3 = (3, 4)$, $e_4 = (4, 5)$, $e_5 = (5, 1)$, and $e_6 = (1, 3)$.

(b) $e_1 = (1, 2)$, $e_2 = (2, 3)$, $e_3 = (3, 4)$, $e_4 = (4, 5)$, $e_5 = (1, 5)$, and $e_6 = (3, 1)$.

(c) $e_1 = (2, 3)$, $e_2 = (1, 2)$, $e_3 = (1, 5)$, $e_4 = (3, 1)$, $e_5 = (3, 4)$, and $e_6 = (4, 5)$.
4 Confirm that $QQ^t = L(G)$, where G is the graph in Fig. 5.6.4 and Q = Q(G) is
the oriented vertex-edge incidence matrix from the corresponding part of
Exercise 3.

5 Let G be the graph in Fig. 5.6.4.

(a) Exhibit the Laplacian matrix L(G).

(b) Compute two different entries of $L(G)^\dagger$.

(c) Illustrate all t(G) spanning trees of G.

6 Compute the classical adjoint $L(G)^\dagger$ if G is the graph

(a), (b), (c): [graphs not reproduced]
7 Compute the Laplacian spectrum s(G) if G is the graph

(a), (b), (c): [graphs not reproduced]
8 M. Fiedler proved that the algebraic connectivity a(G) is at most the vertex
connectivity $\kappa(G)$ of Section 5.5, Exercise 19. Confirm Fiedler's result for the
graph

(a), (b), (c): [graphs not reproduced]
9 Show that the algebraic connectivity $a(T) \le 1$ for any tree T on $n \ge 2$ vertices.

10 Determine whether 

(a) (7, 7, 3, 2, 1) majorizes (5, 5, 5, 5). (Justify your answer.) 

(b) (5, 5, 4, 2) majorizes (4, 4, 4, 4). (Justify your answer.) 

(c) (6) majorizes (2, 2, 2). (Justify your answer.) 

11 Confirm that s(G) majorizes d(G) for the graph

(a), (b), (c): [graphs not reproduced]
12 Let G be a bipartite graph with m edges. Show that G can be oriented so that
$Q(G)^t Q(G) = 2I_m + A(G^{\#})$, where $G^{\#}$ is the line graph of G discussed in
Section 5.5, Exercise 25.

13 If G is a graph on n vertices, prove that $\lambda_i(G^c) + \lambda_{n-i}(G) = n$, $1 \le i < n$, i.e.,
prove that the Laplacian spectrum

\[ s(G^c) = (n - \lambda_{n-1}(G),\ n - \lambda_{n-2}(G),\ \ldots,\ n - \lambda_1(G),\ 0). \]

14 Let G be a graph with vertex set $V(G) = \{1, 2, \ldots, n\}$. Prove that

\[ X L(G) X^t = \sum_{\{i, j\} \in E(G)} (x_i - x_j)^2, \]

where X is the row vector $(x_1, x_2, \ldots, x_n)$.

15 If G is a graph on n vertices, prove that $\lambda_1(G) \le n$, with equality if and only if
$G^c$ is disconnected.

16 Suppose e = {u, v} is an edge of the graph G = (V, E). Recall that
$G - e = (V, E \setminus \{e\})$ is the graph obtained from G by deleting edge e, and
G/e is the graph obtained from G - e by identifying vertices u and v, and
deleting any multiple edges that may have arisen in the process. Denote by G\e
the multigraph obtained from G - e by identifying vertices u and v, and
deleting loops but not multiple edges. If, e.g., G is the graph in Fig. 5.6.5a,
then G\e is the multigraph in Fig. 5.6.5b.

Figure 5.6.5 (a), (b)

(a) Prove that the spanning tree number t(G) = t(G - e) + t(G\e).

(b) Use repeated applications of part (a) to evaluate t(G) for the graph in
Fig. 5.6.2a.

(c) Use repeated applications of part (a) to evaluate t(G) for the graph in
Fig. 5.6.4.

17 If $G_1$ and $G_2$ are graphs on disjoint sets of $n_1$ and $n_2$ vertices, respectively,
prove that the eigenvalues of $L(G_1 \vee G_2)$ are $n_1 + n_2$; $n_2 + \lambda_i(G_1)$, $1 \le i < n_1$;
$n_1 + \lambda_i(G_2)$, $1 \le i < n_2$; and 0.

18 Compute the eigenvalues of L(G) for

(a) $G = K_{2,2}$. (b) $G = K_{2,3}$. (c) $G = K_{1,4}$.
(Hint: $K_{s,t} = K_s^c \vee K_t^c$. Use Exercise 17.)
19 Confirm Corollary 5.6.21 for

(a) $G = K_{2,2}$. (b) $G = K_{2,3}$. (c) $G = K_{1,4}$.

20 Prove that the Laplacian spectrum $s(K_n) = (n, n, \ldots, n, 0)$.

21 Let $H = P_4$.

(a) Compute s(H). 

(b) Confirm Corollary 5.6.21 for H. 

(c) Prove or disprove that, for any graph G, the Laplacian spectrum s(G) 
consists entirely of integers. 

22 Let G and H be the graphs in Fig. 5.6.6. Show that

(a) G and H are not isomorphic.

(b) $\det(xI_6 - L(G)) = x(x - 2)(x - 3)^2(x^2 - 6x + 4)$.

(c) $\det(xI_6 - L(H)) = x(x - 2)(x - 3)^2(x^2 - 6x + 4)$.

Figure 5.6.6
23 Let G and H be the graphs in Fig. 5.6.7.

Figure 5.6.7

(a) Compute the Laplacian spectrum s(G).

(b) Compute $s(G^c)$.

(c) Compute s(H).

(d) Compute $s(H^c)$.

(e) Show that the union $G + G^c$ is not isomorphic to $H + H^c$.

(f) Show that the join $G \vee G^c$ is not isomorphic to $H \vee H^c$.

(g) Compute $s(G + G^c)$.

(h) Show that $s(H + H^c) = s(G + G^c)$.

(i) Show that $s(H \vee H^c) = s(G \vee G^c)$.
5.7. GRAPHIC PARTITIONS



Luck is the residue of design. 



— Branch Rickey 



Suppose $\pi = [\pi_1, \pi_2, \ldots, \pi_\ell] \vdash k$. Under what conditions will $\pi$ be the degree
sequence of some graph?

5.7.1 Definition. Partition $\pi$ is graphic if there exists a graph G with degree
sequence $d(G) = \pi$.

Because the parts of a partition must be positive, but graphs can have isolated
vertices of degree 0, not every degree sequence is a graphic partition. However, the
degree sequence of any graph can be obtained from some graphic partition by
appending finitely many zeros.

An obvious necessary condition for $\pi \vdash k$ to be graphic emerges from the first
theorem of graph theory, namely, k must be even. Almost as obvious is the necessary
condition that $\ell = \pi_1^* \ge \pi_1 + 1$, where $\pi^* = [\pi_1^*, \pi_2^*, \ldots]$ is the partition
conjugate to $\pi$. In a graph with $\pi_1^*$ vertices of positive degree, $\pi_1$ (the maximum vertex
degree) can be no more than $\pi_1^* - 1$. In fact, this second criterion can be extended.
To see how, suppose G is the graph illustrated in Fig. 5.7.1, with vertex set
$V(G) = \{1, 2, \ldots, 6\}$ and degree sequence $d(G) = \pi = (5, 3^2, 2^2, 1)$.

A Young tableau* is a variation on a Ferrers diagram in which the boxes contain
numbers. In Fig. 5.7.2a, e.g., every box in row i of $F(\pi)$ contains vertex number i,
$1 \le i \le 6$. In Fig. 5.7.2b, the boxes in row i of $F(\pi)$ contain, in increasing order, the
numbers of the vertices adjacent in G to vertex i, $1 \le i \le 6$. Note that, in addition to
having the same shape, the two tableaux contain the same integers with the same
multiplicities. While it is framed in the context of this example, the discussion that
follows remains valid for any graphic partition.

Consider the tableau in Fig. 5.7.2b. Because the numbers in each row are
arranged in increasing order, the first column contains all the 1's. Moreover,
because vertex 1 is not adjacent to itself, the top entry of column 1 contains a number
larger than 1. Thus, we recover the second criterion for $\pi$ to be graphic, namely,

\[ \pi_1^* \ge \pi_1 + 1. \]



Figure 5.7.1 [drawing of G on vertices 1-6 not reproduced]



*Named for Alfred Young (1873-1940). 
    (a)              (b)
    1 1 1 1 1        2 3 4 5 6
    2 2 2            1 3 4
    3 3 3            1 2 5
    4 4              1 2
    5 5              1 3
    6                1

Figure 5.7.2



Continuing with the tableau in Fig. 5.7.2b, all the 2's must lie in the first two
columns. Moreover, because the first number in row 1 is at least 2, the second number
in row 1 (i.e., the top number in column 2) must be at least 3. Indeed, since it
cannot be 2, the second number in the second row (i.e., the second number in
column 2) also cannot be less than 3. In addition to all the 1's and all the 2's,
the first two columns of the second tableau must contain (at least) two numbers larger
than 2. Hence, $\pi_1^* + \pi_2^* \ge \pi_1 + \pi_2 + 2$.

As long as $\pi_r \ge r$, this same approach proves that

\[ \pi_1^* + \pi_2^* + \cdots + \pi_r^* \ge \pi_1 + \pi_2 + \cdots + \pi_r + r
= (\pi_1 + 1) + (\pi_2 + 1) + \cdots + (\pi_r + 1). \qquad (5.40) \]



Let's give a name to the number of parts of $\pi$ that satisfy $\pi_r \ge r$.

5.7.2 Definition. If $\pi \vdash k$, the trace of $\pi$ is $f(\pi) = o(\{r : \pi_r \ge r\})$.

Geometrically, $f(\pi)$ is the length of the diagonal of $F(\pi)$. To make them easier
to recognize, the diagonal boxes of the Ferrers diagram for $x = [5, 4, 3^2, 2, 1]$ have
been darkened in Fig. 5.7.3. Note, in particular, that F(x) is completely determined
by its first f(x) rows and columns.

    ■□□□□
    □■□□
    □□■
    □□□
    □□
    □

Figure 5.7.3
Figure 5.7.4 (a) $G_1$, (b) $G_2$ [drawings on vertices 1-6 not reproduced]



5.7.3 (Ruch-Gutman) Theorem.* Let $\pi = [\pi_1, \pi_2, \ldots, \pi_\ell]$ be a partition of 2m
for some positive integer m. Then $\pi$ is graphic if and only if

\[ \sum_{j=1}^{r} \pi_j^* \ge \sum_{j=1}^{r} (\pi_j + 1), \quad 1 \le r \le f(\pi). \qquad (5.41) \]

While they may seem complicated and technical, Inequalities (5.41) are the
same necessary conditions for $\pi$ to be graphic as those expressed by Inequalities
(5.40). Before addressing sufficiency, we will give some examples and discuss an
alternative presentation, due to Tom Roby,† that may be more appealing.
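The test in Theorem 5.7.3 is mechanical enough to code directly. The sketch below is one possible rendering, with hypothetical function names: `conjugate` computes $\pi^*$, `trace` computes $f(\pi)$ as in Definition 5.7.2, and `is_graphic` checks the parity condition together with Inequalities (5.41). Input lists are sorted into nonincreasing order first, and all parts are assumed to be positive integers.

```python
def conjugate(partition):
    """Conjugate partition: the j-th part counts the parts of size at least j."""
    if not partition:
        return []
    return [sum(1 for part in partition if part >= j)
            for j in range(1, max(partition) + 1)]


def trace(partition):
    """Number of indices r (1-based) with pi_r >= r, for a nonincreasing partition."""
    return sum(1 for r, part in enumerate(partition, start=1) if part >= r)


def is_graphic(parts):
    """Ruch-Gutman test: even sum and Inequalities (5.41) for 1 <= r <= f(pi)."""
    pi = sorted(parts, reverse=True)
    if not pi or pi[-1] <= 0 or sum(pi) % 2 != 0:
        return False
    star = conjugate(pi)
    left = right = 0
    for r in range(trace(pi)):  # trace(pi) <= pi[0] = len(star), so star[r] exists
        left += star[r]
        right += pi[r] + 1
        if left < right:
            return False
    return True


print(is_graphic([5, 3, 3, 2, 2, 1]))  # True: the degree sequence of Fig. 5.7.1
print(is_graphic([3, 3, 1, 1]))        # False
```

For instance, [3, 3, 1, 1] fails at r = 2: its conjugate is [4, 2, 2], so the left side is 4 + 2 = 6 while the right side is (3 + 1) + (3 + 1) = 8. This agrees with the direct observation that two vertices of degree 3 among four vertices would force every other vertex to have degree at least 2.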

5.7.4 Example. Consider the partition $x = [5, 4, 3, 3, 2, 1]$. Because $x \vdash 18$, the first condition of Theorem 5.7.3 is satisfied ($m = 9$). From Fig. 5.7.3, it is easy to see that $x^* = [6, 5, 4, 2, 1]$. Because $x_j^* = x_j + 1$, $1 \le j \le 3 = f(x)$, equality holds in each of Inequalities (5.41).
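Inequalities (5.41) translate directly into a short test. The sketch below assumes partitions are given as lists of positive integers; the name `satisfies_ruch_gutman` is hypothetical, and the helper functions are restated so that the block runs on its own.

```python
def conjugate(parts):
    """Conjugate partition: its j-th part counts the parts >= j."""
    if not parts:
        return []
    return [sum(1 for p in parts if p >= j) for j in range(1, max(parts) + 1)]


def trace(parts):
    """f(pi): the number of indices r (counted from 1) with pi_r >= r."""
    ordered = sorted(parts, reverse=True)
    return sum(1 for r, p in enumerate(ordered, start=1) if p >= r)


def satisfies_ruch_gutman(parts):
    """Check that the sum is even and that Inequalities (5.41) hold."""
    pi = sorted(parts, reverse=True)
    if sum(pi) % 2:
        return False
    star = conjugate(pi)
    left = right = 0
    for r in range(trace(pi)):
        left += star[r]        # running sum of the conjugate parts
        right += pi[r] + 1     # running sum of (pi_i + 1)
        if left < right:
            return False
    return True


print(satisfies_ruch_gutman([5, 4, 3, 3, 2, 1]))  # True
print(satisfies_ruch_gutman([3, 3, 1, 1]))        # False
```

The second call illustrates a failure: in any graph on four vertices with two vertices of degree 3, the remaining two vertices must each have degree at least 2.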

In this case, it is easy to construct a graph having degree sequence $x$. Draw six points in the plane and label them 1, 2, ..., 6. Drawing arcs from vertex 1 to each of vertices 2-6 results in the graph $G_1$, illustrated in Fig. 5.7.4a, whose largest vertex degree is $x_1 = 5$.

Joining vertex 2 to vertices 3, 4, and 5 results in the graph $G_2$ shown in Fig. 5.7.4b. Note that the first two components of $d(G_2) = (5, 4, 2, 2, 2, 1)$ are $x_1 = 5$ and $x_2 = 4$. So far, so good. To obtain a graph that realizes $x$, i.e., a graph $G$ with degree sequence $d(G) = x$, it remains to add an arc between vertices 3 and 4 of $G_2$.

*Theorem 5.7.3 seems first to have been published by E. Ruch and I. Gutman, The branching extent of graphs, J. Combin. Inform. System Sci. 4 (1979), 285-295. Also see W. Hasselbarth, Die Verzweigtheit von Graphen, Commun. Math. Computer Chem. (MATCH) 16 (1984), 3-17.
†Tom Roby is a professor at California State University, Hayward.

What about taking this same greedy approach with, say, $y = [3^6]$? With $f(y) = 3$ and $y^* = [6^3]$, it is easy to see that Inequalities (5.41) are satisfied. So, as before, label six points in the plane with the numbers 1-6. Joining vertex 1 to vertices 2-4 results in a graph $G_1$ having degree sequence $d(G_1) = (3, 1, 1, 1, 0, 0)$, the first coordinate of which is $3 = y_1$. Graph $G_2$ with degree sequence $d(G_2) = (3, 3, 2, 2, 0, 0)$ is obtained from $G_1$ by adding arcs from vertex 2 to vertices 3 and 4. Finally, adding an arc between vertices 3 and 4 results in the graph $G_3$, illustrated in Fig. 5.7.5a, with degree sequence $d(G_3) = (3, 3, 3, 3, 0, 0)$. So far, so good. However, as a moment's reflection shows, no graph realizing $y$ can be obtained from $G_3$ by adding more arcs! (A graph that does realize $y$ can be found in Fig. 5.7.5b.) □

Figure 5.7.5
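The greedy construction used in this example can be sketched in a few lines. This is one plausible reading of the procedure, under the assumption that vertex $i$ is joined to later vertices in increasing order for as long as both endpoints still need edges; the function name `greedy_realization` is hypothetical.

```python
def greedy_realization(degrees):
    """Try to realize a degree sequence greedily.

    Vertices are numbered 1..n. For i = 1, 2, ..., vertex i is joined to
    later vertices j > i, in increasing order of j, while both i and j still
    need more edges. Returns the list of edges on success, or None if some
    vertex is left short of its target degree.
    """
    need = list(degrees)
    n = len(need)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            if need[i] == 0:
                break
            if need[j] > 0:
                edges.append((i + 1, j + 1))
                need[i] -= 1
                need[j] -= 1
    return edges if not any(need) else None


print(greedy_realization([5, 4, 3, 3, 2, 1]))
# [(1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (2, 3), (2, 4), (2, 5), (3, 4)]
print(greedy_realization([3, 3, 3, 3, 3, 3]))  # None
```

For the first sequence the edges appear in the same order as in the construction of $G_1$, $G_2$, and the final graph. For the second, the sketch gets stuck once the first four vertices have degree 3, just as described above.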



5.7.5 Definition. Suppose $x \vdash 2m$. If $x_j^* = x_j + 1$, $1 \le j \le f(x)$, then $x$ is a threshold partition.



Coming to the promised alternative presentation of the Ruch-Gutman criteria, suppose $\pi \vdash k$. Denote that portion of $F(\pi)$ consisting of the boxes on or to the right of its diagonal by $R(\pi)$. Let $B(\pi)$ be what's left, i.e., the boxes below the diagonal. If $\pi = [4, 3^2, 2^2, 1^2]$, e.g., this division of $F(\pi)$ is illustrated in Fig. 5.7.6 (where diagonal boxes have again been darkened to facilitate their easy recognition).

5.7.6 Definition. Suppose $\pi \vdash k$. Let $\rho(\pi)$ be the partition whose parts are the lengths of the rows of the shifted shape $R(\pi)$. Denote by $\beta(\pi)$ the partition whose parts are the lengths of the columns of $B(\pi)$.



Figure 5.7.6 (division of $F(\pi)$ into $R(\pi)$ and $B(\pi)$ for $\pi = [4, 3^2, 2^2, 1^2]$)



*While the greedy approach does not work in all cases, it does work whenever $\pi \vdash 2m$ satisfies $\pi_j^* = \pi_j + 1$, $1 \le j \le f(\pi)$. See Exercise 11 (below).

If $\pi = [4, 3^2, 2^2, 1^2]$ then, from Fig. 5.7.6, $\rho(\pi) = [4, 2, 1]$ and $\beta(\pi) = [6, 3]$. Observe, in general, that shifted shapes $R(\pi)$ and $B(\pi)$ can be the pieces of such a division of $F(\pi)$, only if $f(\pi) - 1 \le \ell(\beta(\pi)) \le f(\pi) = \ell(\rho(\pi))$.

5.7.7 Definition. If $(a) = (a_1, a_2, \ldots, a_s)$ and $(b) = (b_1, b_2, \ldots, b_t)$ are two nonincreasing sequences of real numbers, then $(a)$ weakly majorizes $(b)$, written $(a) \succeq (b)$, if $t \ge s$,

$$\sum_{i=1}^{r} a_i \ge \sum_{i=1}^{r} b_i, \qquad 1 \le r \le s, \qquad (5.42)$$

and

$$\sum_{i=1}^{s} a_i \ge \sum_{i=1}^{t} b_i. \qquad (5.43)$$

Evidently, $(a)$ majorizes $(b)$ if and only if $(a) \succeq (b)$, with equality in Inequality (5.43). With the appearance of Definition 5.7.7 we finally have the vocabulary we needed to state Roby's elegant variation on the Ruch-Gutman criteria.

5.7.8 Theorem. If $\pi \vdash 2m$, then $\pi$ is graphic if and only if $\beta(\pi)$ weakly majorizes $\rho(\pi)$.
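Roby's criterion is also easy to test numerically. The sketch below is illustrative only: it assumes partitions are lists of positive integers, reads the row lengths of $R(\pi)$ as $\pi_i - (i - 1)$ and the column lengths of $B(\pi)$ as $\pi_j^* - j$ for $1 \le i, j \le f(\pi)$, and uses hypothetical function names.

```python
def conjugate(parts):
    """Conjugate partition: its j-th part counts the parts >= j."""
    if not parts:
        return []
    return [sum(1 for p in parts if p >= j) for j in range(1, max(parts) + 1)]


def trace(parts):
    """f(pi): the number of indices r (counted from 1) with pi_r >= r."""
    ordered = sorted(parts, reverse=True)
    return sum(1 for r, p in enumerate(ordered, start=1) if p >= r)


def rho(parts):
    """Row lengths of R(pi): boxes on or to the right of the diagonal."""
    pi = sorted(parts, reverse=True)
    return [pi[i] - i for i in range(trace(pi))]


def beta(parts):
    """Column lengths of B(pi): boxes below the diagonal (empty columns dropped)."""
    star = conjugate(parts)
    columns = [star[j] - (j + 1) for j in range(trace(parts))]
    return [c for c in columns if c > 0]


def weakly_majorizes(a, b):
    """Definition 5.7.7: len(b) >= len(a), (5.42) for r <= len(a), and (5.43)."""
    if len(b) < len(a):
        return False
    for r in range(1, len(a) + 1):
        if sum(a[:r]) < sum(b[:r]):
            return False
    return sum(a) >= sum(b)


pi = [4, 3, 3, 2, 2, 1, 1]
print(rho(pi), beta(pi))                    # [4, 2, 1] [6, 3]
print(weakly_majorizes(beta(pi), rho(pi)))  # True
```

Since the sum of the parts of `pi` is the even number 16, Theorem 5.7.8 lets one conclude from the last line that this partition is graphic.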

To see that Theorems 5.7.3 and 5.7.8 are equivalent, observe that $\beta_1 \ge \rho_1$ if and only if $\pi_1^* - 1 \ge \pi_1$, if and only if $\pi_1^* \ge \pi_1 + 1$; $\beta_1 + \beta_2 \ge \rho_1 + \rho_2$ if and only if $(\pi_1^* - 1) + (\pi_2^* - 2) \ge \pi_1 + (\pi_2 - 1)$, if and only if $\pi_1^* + \pi_2^* \ge (\pi_1 + 1) + (\pi_2 + 1)$; and so on. Notice that equality holds throughout Inequalities (5.41), if and only if $\pi$ is a threshold partition, if and only if $\pi_i^* = \pi_i + 1$, $1 \le i \le f(\pi)$, if and only if $\beta(\pi) = \rho(\pi)$. Let's formalize this observation for future reference.

5.7.9 Corollary. Partition $\pi$ is a threshold partition if and only if $\beta(\pi) = \rho(\pi)$.

However they may be stated, the proof that the Ruch-Gutman criteria are sufficient for $\pi \vdash 2m$ to be graphic begins with the following.

5.7.10 Lemma. If $x \vdash 2m$ is a threshold partition, then $x$ is a graphic partition.

5.7.11 Example. Consider the partition $x = [5, 4, 3^2, 2, 1]$ in Example 5.7.4. From the division of $F(x)$ illustrated in Fig. 5.7.7a (with no boxes darkened), it is easy to see that $B(x)$ is the transpose of $R(x)$, so $\beta(x) = \rho(x)$, i.e., $x$ is a threshold partition.

Observe that the symmetric, $\ell(x) \times \ell(x)$, (0, 1)-matrix $A(x)$ in Fig. 5.7.7b, obtained from Fig. 5.7.7a by replacing boxes with 1's and spaces with 0's, is the adjacency matrix of a graph with degree sequence $x$. □
Figure 5.7.7 (panels (a) division of $F(x)$ and (b) the matrix $A(x)$)



Proof of Lemma 5.7.10. As in Example 5.7.11, let $A(x) = (a_{ij})$ be the $\ell(x)$-square matrix defined by $a_{ij} = 0$ if $i = j$ or if $i < j$ and $x_i + 1 < j$; $a_{ij} = 1$ if $i < j \le x_i + 1$; and $a_{ij} = a_{ji}$ if $i > j$. Then $A(x)$ is the adjacency matrix of a graph realizing $x$. ■
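The matrix in this proof can be built directly from its definition. The following sketch assumes the input is a threshold partition given as a nonincreasing list; the function name `threshold_adjacency` is hypothetical, and rows and columns are stored with 0-based indices while the comments use the 1-based indices of the proof.

```python
def threshold_adjacency(x):
    """Build A(x): a_ij = 1 exactly when i < j <= x_i + 1, then symmetrize."""
    n = len(x)
    A = [[0] * n for _ in range(n)]
    for i in range(1, n + 1):            # 1-based row index, as in the proof
        for j in range(i + 1, n + 1):    # entries above the diagonal
            if j <= x[i - 1] + 1:
                A[i - 1][j - 1] = 1
                A[j - 1][i - 1] = 1      # a_ij = a_ji for i > j
    return A


x = [5, 4, 3, 3, 2, 1]
A = threshold_adjacency(x)
for row in A:
    print(row)
print([sum(row) for row in A])  # [5, 4, 3, 3, 2, 1]
```

The row sums recover the degree sequence, which is the content of the lemma for this particular $x$.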

Sufficiency of the Ruch-Gutman criteria: The proof of sufficiency can be reduced to Lemma 5.7.10 in two steps. The first is to show that if $\pi$ is majorized (that's right, not weakly majorized, but majorized) by a graphic partition, then $\pi$ is graphic. The second is to show that any partition that satisfies Inequalities (5.41) is majorized by a threshold partition. Details are omitted.

5.7.12 Example. While any two partitions of $k$ are majorization comparable when $k \le 5$, neither $[3^2]$ nor $[4, 1^2]$ majorizes the other. The majorization partial order for the 11 partitions of 6 is illustrated by the so-called Hasse diagram in Fig. 5.7.8, where the graphic partitions have been darkened. Observe that the threshold partitions $[2^3]$ and $[3, 1^3]$ are maximal among the graphic partitions. □
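The claims in this example can be checked by brute force. The sketch below is an illustration under stated assumptions: partitions are nonincreasing lists, `majorizes` pads the shorter list with zeros (equivalent to Definition 5.7.7 with equality in (5.43) when all parts are positive), and `is_graphic` applies Inequalities (5.41). All function names are hypothetical.

```python
def partitions(k, largest=None):
    """Yield the partitions of k as nonincreasing lists."""
    if largest is None:
        largest = k
    if k == 0:
        yield []
        return
    for first in range(min(k, largest), 0, -1):
        for rest in partitions(k - first, first):
            yield [first] + rest


def majorizes(a, b):
    """True if a majorizes b (equal sums, dominating partial sums)."""
    if sum(a) != sum(b):
        return False
    total_a = total_b = 0
    for r in range(max(len(a), len(b))):
        total_a += a[r] if r < len(a) else 0
        total_b += b[r] if r < len(b) else 0
        if total_a < total_b:
            return False
    return True


def conjugate(parts):
    """Conjugate partition: its j-th part counts the parts >= j."""
    return [sum(1 for p in parts if p >= j) for j in range(1, max(parts) + 1)]


def is_graphic(pi):
    """Ruch-Gutman test (5.41) for a nonincreasing list pi of positive integers."""
    star = conjugate(pi)
    left = right = 0
    for r, p in enumerate(pi):
        if p < r + 1:          # past the trace f(pi)
            break
        left += star[r]
        right += p + 1
        if left < right:
            return False
    return sum(pi) % 2 == 0


all6 = list(partitions(6))
graphic = [p for p in all6 if is_graphic(p)]
maximal = [g for g in graphic
           if not any(h != g and majorizes(h, g) for h in graphic)]
print(len(all6), len(graphic))  # 11 5
print(maximal)                  # [[3, 1, 1, 1], [2, 2, 2]]
print(majorizes([3, 3], [4, 1, 1]), majorizes([4, 1, 1], [3, 3]))  # False False
```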

5.7.13 Definition. A threshold graph is one whose degree sequence is, apart 
from 0's, a threshold partition. 

Many interesting things are known about threshold graphs, a few of which are 
listed below. 

5.7.14 Theorem. A threshold graph is uniquely determined by its degree 
sequence, i.e., two threshold graphs are isomorphic if and only if they have the 
same degree sequence. 

Figure 5.7.8. Partitions of 6 partially ordered by majorization.

*Further details can be found, e.g., in R. Merris, Graph Theory, Wiley-Interscience, New York, 2001.

It follows from Theorem 5.7.14 that there is a one-to-one correspondence between the threshold graphs with $m$ edges and no isolated vertices, and the partitions $x \vdash 2m$ that satisfy $\rho(x) = \beta(x)$, i.e., that satisfy $R(x) = B(x)^t$. In other words, there is a one-to-one correspondence between the threshold graphs with $m$ edges and no isolated vertices, and the shifted shape partitions of $m$. But the shifted shape partitions are precisely the partitions having distinct parts. In view of Example 4.3.10, this proves the following.

5.7.15 Corollary. The number of nonisomorphic threshold graphs with $m$ edges and no isolated vertices is the coefficient of $x^m$ in the generating function

$$\prod_{j \ge 1} (1 + x^j) = 1 + x + x^2 + 2x^3 + 2x^4 + 3x^5 + 4x^6 + \cdots$$
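The coefficients in this expansion are easy to generate. The following sketch multiplies out the factors $(1 + x^j)$ one at a time, keeping only terms of degree at most `max_m`; the function name `distinct_part_counts` is a hypothetical choice.

```python
def distinct_part_counts(max_m):
    """Coefficients of x^0, ..., x^max_m in the product of (1 + x^j), j >= 1."""
    coeffs = [1] + [0] * max_m
    for j in range(1, max_m + 1):
        # Multiply by (1 + x^j); go downward so each factor is used once.
        for m in range(max_m, j - 1, -1):
            coeffs[m] += coeffs[m - j]
    return coeffs


print(distinct_part_counts(6))  # [1, 1, 1, 2, 2, 3, 4]
```

Factors with $j$ larger than `max_m` cannot affect the retained coefficients, which is why the outer loop stops there.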

5.7.16 Example. The 12 nonisomorphic connected threshold graphs with $2 \le m \le 6$ edges are illustrated in Fig. 5.7.9. □

Figure 5.7.9 (panels (a) through (l))
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Finally, there is an interesting characterization of threshold graphs by means of 
Laplacian spectra. 

5.7.17 (Merris's) Theorem.* Let $G$ be a graph on $n$ vertices, none of which is isolated (of degree 0). Then $G$ is a threshold graph if and only if the conjugate of its degree sequence is equal to $[\lambda_1(G), \lambda_2(G), \ldots, \lambda_{n-1}(G)]$, where $s(G) = (\lambda_1(G), \lambda_2(G), \ldots, \lambda_n(G))$ is the Laplacian spectrum of $G$.

While not especially difficult, the proof of Theorem 5.7.17 is beyond the scope 
of this text. 



5.7. EXERCISES 

1 Which of the following sequences weakly majorizes (2.5, 1.5, 1)? Justify your 
answer. 

(a) (3, 1). (b) (3, 2). (c) (3, 3). 

(d) (4, 1). (e) (5, 1). (f) (6, 1).

2 Exhibit $F(\pi)$, $R(\pi)$, $B(\pi)$, and $f(\pi)$ for the partition
(a) $\pi = [6^2, 2^2]$. (b) $\pi = [6^2, 3^4, 2]$.
(c) $\pi = [7, 6, 5^2, 4^2, 2, 1]$. (d) $\pi = [4^5]$.
(e) $\pi = [5^4]$. (f) $\pi$ = [2%
(g) $\pi = [3, 2^2, 1^2]$. (h) $\pi = [2, 1^5]$.

3 Which of the partitions in Exercise 2 
(a) is graphic? (b) is threshold? 
(Justify your answers.) 

4 Exhibit a graph with degree sequence 

(a) (4,4,3,2,2,1). (b) (5,5,3,3,3,3). 

5 Exhibit two nonisomorphic graphs, both having degree sequence 
(a) (2,2,2,2,2,2). (b) (3,3,3,3,3,3). 

6 Graph $G$ is $r$-regular if $d_G(v) = r$ for all $v \in V(G)$. Prove that

(a) $\pi = [r^n]$ is graphic if $1 \le r < n$ and the product $r \times n$ is even.

(b) $\pi = [r^n]$ is threshold if and only if $r = n - 1$.



*R. Merris, Degree maximal graphs are Laplacian integral, Linear Algebra Appl. 199 (1994), 381-389. 
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7 Exhibit graphs whose degree sequences match the five graphic partitions of 
Fig. 5.7.8. 

8 A graph on $n$ vertices is antiregular if its multiset of vertex degrees contains $n - 1$ different numbers. (See, e.g., Example 5.7.4.)

(a) Illustrate the nonisomorphic antiregular graphs on five vertices.

(b) If $G$ is a connected antiregular graph on $n$ vertices, show that there exist two vertices $u, w \in V(G)$ such that $d_G(u) = d_G(w)$.

(c) Show that the common value of $d_G(u)$ and $d_G(w)$ in part (b) is $\lceil (n - 1)/2 \rceil$, the integer obtained from $(n - 1)/2$ by rounding up.

(d) Prove that every connected antiregular graph is a threshold graph.

9 Prove that $\pi^*$ majorizes $\pi$ for every graphic partition $\pi$.

10 Confirm Theorem 5.7.14 by proving independently that, up to isomorphism, there is just one graph with degree sequence

(a) $[5^2, 2^4]$. (b) [5 : (c) $[5, 4, 3^2, 2, 1]$.

11 Design an algorithm to input a threshold partition $x$, and return a (threshold) graph $G$ satisfying $d(G) = x$.

12 Let $\pi = [5^2, 3^4]$. Show that

(a) $\pi$ is a graphic partition.

(b) $\pi$ is not a threshold partition.

(c) up to isomorphism, there is a unique graph having degree sequence $\pi$.

13 Confirm Theorem 5.7.17 for the graph $G$ in
(a) Fig. 5.7.9d. (b) Fig. 5.7.9e.
(c) Fig. 5.7.9h. (d) Fig. 5.7.9l.

14 If $G$ is a threshold graph, then the chromatic number $\chi(G) = \omega(G)$, the size of a largest clique. Confirm this result for the graph in
(a) Fig. 5.7.9d. (b) Fig. 5.7.9e.
(c) Fig. 5.7.9h. (d) Fig. 5.7.9l.

15 If $G = (V, E)$ is a threshold graph, there exists an integer $t$ and an integer-valued function $f$ of $V$ such that $\{u, v\} \in E$ if and only if $f(u) + f(v) > t$. Confirm this result by finding $f$ and $t$ for the graph in
(a) Fig. 5.7.9d. (b) Fig. 5.7.9e.
(c) Fig. 5.7.9h. (d) Fig. 5.7.9l.

16 If $G = (V, E)$ is a threshold graph, there exists an integer $t$ and a positive integer-valued function $f$ of $V$ such that $X \subseteq V$ is an independent set of vertices if and only if $\sum_{u \in X} f(u) \le t$. Confirm this result by finding $f$ and $t$ for the graph in
(a) Fig. 5.7.9d. (b) Fig. 5.7.9e.
(c) Fig. 5.7.9h. (d) Fig. 5.7.9l.

17 The function $f$ in Exercise 16 is called a threshold labeling. In many cases, labeling vertices by their degrees produces a threshold labeling.

(a) Show that labeling the vertices of the graph in Fig. 5.7.9l by its vertex degrees is not a threshold labeling.

(b) Find a threshold labeling for the graph in Fig. 5.7.9l.

18 It is known that G is a threshold graph if and only if it does not contain an 
induced subgraph isomorphic to one of the three forbidden graphs P4, C4, or 
K2 + K2. Show that none of these forbidden graphs is a threshold graph. 

19 A split graph is one whose vertex set can be partitioned into a clique and an 
independent set. It is known that G is a split graph if and only if it does not 
have an induced subgraph isomorphic to one of the three graphs C4, C5, or 
K2 + K2. Prove that every threshold graph is a split graph. 

20 Prove that the Laplacian spectrum of a threshold graph consists entirely of 
integers. 

21 Find a connected, nonthreshold graph G whose Laplacian spectrum consists 
entirely of integers. 

22 Let G be a threshold graph on n vertices. Prove that G either has a vertex of 
degree 0 or a vertex of degree n — 1 . 

23 Confirm the result you obtained in Exercise 19 for the graph in
(a) Fig. 5.7.9d. (b) Fig. 5.7.9e.
(c) Fig. 5.7.9h. (d) Fig. 5.7.9l.

24 Let $G$ be a threshold graph. Suppose $\{u, v\} \in E(G)$. If $x, y \in V(G)$ satisfy $d(x) > d(u)$ and $d(y) > d(v)$, prove that $\{x, y\} \in E(G)$.

25 It is known that there are exactly $2^{n-2}$ nonisomorphic connected threshold graphs on $n$ vertices. When $n = 5$, three of the eight are exhibited in Fig. 5.7.9. Illustrate the other five.

26 Graph $G$ is an interval graph if there is a one-to-one function $f$ from $V(G)$ into the family of open intervals of the real line such that $\{u, v\} \in E(G)$ if and only if $f(u) \cap f(v) \ne \emptyset$. It is known that every threshold graph is an interval graph. Confirm this for the graph $G$ in
(a) Fig. 5.7.9d. (b) Fig. 5.7.9e.
(c) Fig. 5.7.9h. (d) Fig. 5.7.9l.






27 Let $G$ be a graph with vertex set $V = \{v_1, v_2, \ldots, v_n\}$ and edge set $E = \{e_1, e_2, \ldots, e_m\}$. The $n \times m$ incidence matrix $T(G) = (t_{ij})$ is defined by $t_{ij} = 1$ if $v_i$ is incident with $e_j$, and $t_{ij} = 0$ otherwise.

(a) Show that $t_{ij} = |q_{ij}|$, $1 \le i \le n$, $1 \le j \le m$, where $Q(G) = (q_{ij})$ is the oriented vertex-edge incidence matrix afforded by some arbitrary orientation of $G$.

(b) If $G$ is $r$-regular (Exercise 6), show that $n = 2m/r$.

(c) If $G$ is $r$-regular, $r > 1$, show that the triple of parameters for the binary code $\mathcal{C}$ comprised of the rows of $T(G)$ is $(m, 2m/r, 2r - 2)$.
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While this chapter is independent of Chapters 3-5, Section 1.4 is an essential 
prerequisite for Sections 6.1, 6.2, and 6.4. 

In 1455, Johann Gutenberg printed what is commonly believed to have been the 
first book set in movable type. By making information widely accessible, this tech- 
nical innovation profoundly influenced the development of human civilization for 
the next 500 years. Indeed, the next leap of comparable magnitude did not take 
place until 1946, when civilian scientists began to think of information as strings 
of 0's and 1's.

Launched March 2, 1972, Pioneer 10 was the first spacecraft to travel through 
the asteroid belt. After a rendezvous with Jupiter in December, 1973, Pioneer con- 
tinued downstream through the heliomagnetosphere, passing the orbit of Pluto in 
1983. On March 2, 2002, 30 years after its launch, 5 years after its scientific mission 
ended, and 22 hours after a message was beamed to it from a NASA facility in the 
Mojave Desert, a $10^{-20}$-watt signal was received from the spacecraft by a radio
telescope in Spain. The fact that an identifiable signal could be detected at all is 
an engineering triumph of the first magnitude. The fact that the message carried 
by the signal was decipherable, despite distance and background noise, is a tri- 
umph for the mathematical theory of error-correcting codes, the defining topic of 
this chapter. 

Apart from the pictures themselves, one of the most dramatic things about 
photographs from the early Pioneer, Voyager, and Mariner missions was their 
emergence, one pixel at a time, on the big screen of the Jet Propulsion Laboratory 
as the transmissions from space were decoded in real time. This achievement was 
made possible by a combination of the fastest digital computers then available and a 
fast algorithm for decoding messages, the topic of Sections 6.1 and 6.2. 

Applications of error-correcting codes in telecommunications have driven renewed interest in a beautiful area of combinatorics that deals with relationships between numerical constraints and geometric configurations. Mutually orthogonal Latin squares and their connection with finite projective planes are the subject of Section 6.3. This section is independent of Sections 6.1 and 6.2 and may be used, all by itself, as an optional excursion at any point during the course. On the other hand, some readers may wish to exit from Chapter 6, either immediately after Example 6.2.9, or at the end of Section 6.2.

Applications of finite projective planes come through their (0,1)-incidence matrices, motivating the generalization to balanced incomplete block designs (BIBDs). Section 6.4 is a brief introduction to the existence of BIBDs.

*Pioneer 10 had, by then, ceased to be the most distant man-made object. On February 17, 1998, it was surpassed by Voyager 1, headed in the opposite direction, upstream toward the nose of the heliomagnetosphere.

Combinatorics, Second Edition, by Russell Merris.
ISBN 0-471-26296-X © 2003 John Wiley & Sons, Inc.



6.1. LINEAR CODES 

Recall from Section 1.4 that an $(n, M, d)$ code $\mathcal{C}$ is a nonempty set of $M$ binary words of length $n$, the minimum distance between any pair of which is $d$. Nearest-neighbor decoding refers to a process by which an erroneous binary word $b$ is corrected to a legitimate codeword $c$ such that

$$d(b, c) = \min_{w \in \mathcal{C}} d(b, w).$$

An $(n, M, d)$ code can reliably detect as many as $d - 1$ errors. Using nearest-neighbor decoding, it can reliably correct as many as $r = \lfloor (d - 1)/2 \rfloor$.

So far, so good, but what about a practical process? Given a binary word $b$, how, exactly, does one go about finding a codeword $w$ that minimizes $d(b, w)$? When $\mathcal{C} = \{00000, 11111\}$, that isn't much of a problem. When $M$ is large, however, it may not be easy to find the smallest number in the $M$-element set $\{d(b, w) : w \in \mathcal{C}\}$, much less compute $d(b, w)$ for every $w \in \mathcal{C}$, much less do these things for every word in a long message, much less do it in real time! Among the many virtues of linear codes is a fast, efficient process for nearest-neighbor decoding.
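To make the cost of the naive approach concrete, here is a minimal brute-force sketch of nearest-neighbor decoding. It assumes binary words are stored as Python strings of 0's and 1's; the function names are illustrative. Each call computes all $M$ distances, which is exactly the chore described above.

```python
def distance(u, v):
    """Hamming distance: the number of positions in which u and v differ."""
    return sum(a != b for a, b in zip(u, v))


def nearest_neighbor(b, code):
    """Return a codeword of `code` closest to b (the first one found on ties)."""
    return min(code, key=lambda w: distance(b, w))


code = ["00000", "11111"]
print(nearest_neighbor("01011", code))  # 11111
print(nearest_neighbor("00100", code))  # 00000
```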

Our discussion of linear codes begins with the notion of Boolean arithmetic.* Recall that a binary code is a subset of $F^n$, i.e., a set of $n$-bit words assembled from the alphabet $F = \{0, 1\}$. In these statements, 0 and 1 are viewed as letters. However, it can be useful to view them as numbers. The distinction involves arithmetic, of a sort. Boolean addition and multiplication are defined for the elements of $F$ by means of the tables in Fig. 6.1.1.

     + | 0  1          × | 0  1
    ---+------        ---+------
     0 | 0  1          0 | 0  0
     1 | 1  0          1 | 0  1

Figure 6.1.1. Boolean arithmetic.

*Named for George Boole (1815-1864).
†Boolean arithmetic makes $F = \{0, 1\}$ a "field of characteristic 2." Having no need for the theory behind these words, we will avoid using them.

While Boolean multiplication is identical to ordinary multiplication, Boolean addition differs from ordinary addition in one important way, namely, $1 + 1 = 0$. In effect, this makes $+1$ and $-1$ the same, which makes addition the same as subtraction! Boolean addition is extended to $F^n$, bit by bit.

6.1.1 Example. In $F^3$, e.g., 101 + 011 = 110, 010 + 111 = 101, 110 + 111 = 001, and 110 + 110 = 000. Indeed, $w + w = 000$ for all $w \in F^3$. Boolean addition of words from $F^n$ is both commutative and associative. □

The first important application of Boolean arithmetic requires the following idea. 

6.1.2 Definition. The weight of a binary word $u$, denoted $\mathrm{wt}(u)$, is the number of 1's in $u$.

For example, wt(01011000) = 3 and wt(10111001) = 5.

6.1.3 Theorem. The distance between binary words $u, v \in F^n$ is the weight of their sum, i.e., $d(u, v) = \mathrm{wt}(u + v)$.

Proof. If $u = u_1 u_2 \cdots u_n$ and $v = v_1 v_2 \cdots v_n$ then $w_i$, the $i$th bit of $u + v = w_1 w_2 \cdots w_n$, is 1 if and only if $u_i \ne v_i$. (See the addition table in Fig. 6.1.1.) Thus, $\mathrm{wt}(u + v)$ counts the number of places in which $u$ and $v$ differ. ■
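Boolean addition, weight, and distance are each one-liners on a computer. The sketch below assumes binary words are strings of 0's and 1's of equal length, with illustrative function names, and confirms Theorem 6.1.3 exhaustively for words of length 3.

```python
from itertools import product


def add(u, v):
    """Bitwise Boolean sum of two binary words of the same length."""
    return "".join("1" if a != b else "0" for a, b in zip(u, v))


def wt(u):
    """Weight: the number of 1's in u."""
    return u.count("1")


def distance(u, v):
    """Hamming distance: the number of positions in which u and v differ."""
    return sum(a != b for a, b in zip(u, v))


print(add("101", "011"))  # 110
print(wt("10111001"))     # 5

# Theorem 6.1.3, checked for every pair of words in F^3.
words = ["".join(bits) for bits in product("01", repeat=3)]
print(all(distance(u, v) == wt(add(u, v)) for u in words for v in words))  # True
```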

6.1.4 Definition. A binary code $\mathcal{C}$ is linear if the sum of any two codewords is another codeword, i.e., if $u + v \in \mathcal{C}$ for all $u, v \in \mathcal{C}$.

6.1.5 Example. Among the linear codes is $F^n$, the set of all $2^n$ binary words of length $n$. The code $\mathcal{C}_1 = \{000, 100, 001, 101\}$ is linear, but the code $S = \{101, 010, 111\}$ is not. While it is true that the sums 101 + 010 = 111, 101 + 111 = 010, and 010 + 111 = 101 are all elements of $S$, 101 + 101 = 000 is not. The smallest linear code containing $S$ is $\mathcal{C}_2 = \{000, 101, 010, 111\}$. □
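Linearity can be tested by checking closure under addition, and the smallest linear code containing a set can be found by repeatedly adding sums until nothing new appears. This is a brute-force sketch with hypothetical function names, assuming a nonempty set of equal-length binary strings.

```python
def add(u, v):
    """Bitwise Boolean sum of two binary words of the same length."""
    return "".join("1" if a != b else "0" for a, b in zip(u, v))


def is_linear(code):
    """True if the sum of any two codewords (including u + u) is a codeword."""
    code = set(code)
    return all(add(u, v) in code for u in code for v in code)


def span(S):
    """Smallest linear code containing the nonempty set S."""
    code = set(S)
    while True:
        new = {add(u, v) for u in code for v in code} - code
        if not new:
            return code
        code |= new


print(is_linear({"000", "100", "001", "101"}))  # True
print(is_linear({"101", "010", "111"}))         # False
print(sorted(span({"101", "010", "111"})))      # ['000', '010', '101', '111']
```

Because $u + u$ is the zero word, the closure loop picks up 000 on its first pass without any special handling.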

It is clear from Example 6.1.5 that every linear code contains the binary word each of whose bits is 0. When $n$ is understood, the zero word $00 \ldots 0 \in F^n$ is denoted 0. This usage introduces an obvious ambiguity. Whether the symbol 0 is to be understood as a single bit or as a binary word of length $n$ will have to be discerned from the context.

Why introduce deliberate ambiguities? Why not use, e.g., $z$ to denote the zero word? It is because the zero word plays the role of zero in the sense that $b + 0 = b$ for every binary word $b \in F^n$. Indeed, with respect to the "scalar multiplication" of binary words defined by

$$1b = b \quad \text{and} \quad 0b = 0, \qquad (6.1)$$

$F^n$ is a vector space, with binary words playing the role of vectors, and $F = \{0, 1\}$ playing the role of the "scalar field" (where the arithmetic is Boolean). While binary codes are subsets of $F^n$, linear codes are subspaces of $F^n$, i.e., $\mathcal{C}$ is a linear code if and only if it is a (Boolean) vector space!

6.1.6 Definition. If $S$ is a nonempty subset of $F^n$, the subspace generated (or spanned) by $S$ is the linear code $\mathcal{L}(S)$ consisting of all (Boolean) linear combinations of binary words from $S$.

As in "ordinary" linear algebra,* $\mathcal{L}(S)$ is the intersection of all the linear codes (subspaces of $F^n$) that contain $S$. In particular (when $\emptyset \ne S \subseteq F^n$), $S \subseteq \mathcal{L}(S)$ with equality if and only if $S$ is a linear code.

Recall that a minimal generating set is a basis.† (Evidently, $S$ contains a basis of $\mathcal{L}(S)$.) If $B = \{u_1, u_2, \ldots, u_k\}$ is a basis of $\mathcal{C}$, then every codeword $w \in \mathcal{C}$ is uniquely expressible as a linear combination

$$w = a_1 u_1 + a_2 u_2 + \cdots + a_k u_k, \qquad (6.2)$$

where $a_i \in F$, $1 \le i \le k$. Since each $a_i$ is either 0 or 1, each codeword $w$ is a simple sum of basis vectors. Selecting a codeword $w$ by specifying the coefficients in equation (6.2) is equivalent to making a sequence of $k$ decisions, each having two choices, i.e., $o(\mathcal{C}) = 2^k$, so $\mathcal{C}$ is an $(n, 2^k, d)$ code.§

6.1.7 Corollary. If $\mathcal{C}$ is an $(n, 2^k, d)$ linear code, then $d$ is the minimum of the weights of the nonzero codewords of $\mathcal{C}$, i.e.,

$$d = \min_{0 \ne w \in \mathcal{C}} \mathrm{wt}(w).$$

To determine $d$ for an $M$-word code generally requires computing and comparing $C(M, 2)$ distances. For linear codes, Corollary 6.1.7 reduces the chore by a factor of $2/M$.

Proof of Corollary 6.1.7. Since $u + v = 0$ if and only if $u = v$, and since $u + v \in \mathcal{C}$ whenever $u, v \in \mathcal{C}$, it follows from Theorem 6.1.3 that

$$d = \min_{u \ne v \in \mathcal{C}} d(u, v) \ge \min_{0 \ne w \in \mathcal{C}} \mathrm{wt}(w).$$

The reverse inequality is a consequence of the fact that $d \le d(w, 0) = \mathrm{wt}(w + 0) = \mathrm{wt}(w)$ whenever $0 \ne w \in \mathcal{C}$. ■
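The corollary can be illustrated by computing the minimum distance both ways. The sketch below uses hypothetical function names and represents codes as sets of binary strings; the last example is a nonlinear code, included to show that the shortcut genuinely depends on linearity.

```python
from itertools import combinations


def add(u, v):
    """Bitwise Boolean sum of two binary words of the same length."""
    return "".join("1" if a != b else "0" for a, b in zip(u, v))


def wt(u):
    """Weight: the number of 1's in u."""
    return u.count("1")


def min_distance(code):
    """Minimum of d(u, v) = wt(u + v) over all C(M, 2) pairs of codewords."""
    return min(wt(add(u, v)) for u, v in combinations(sorted(code), 2))


def min_nonzero_weight(code):
    """Minimum weight over the nonzero codewords (only M - 1 weights needed)."""
    return min(wt(w) for w in code if wt(w) > 0)


linear = {"000", "101", "010", "111"}
print(min_distance(linear), min_nonzero_weight(linear))        # 1 1

nonlinear = {"111", "110"}
print(min_distance(nonlinear), min_nonzero_weight(nonlinear))  # 1 2
```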

*For example, where the scalars come from the real number field $\mathbb{R}$.
†A basis of $\mathcal{C}$ is a linearly independent set $B$ of vectors such that $\mathcal{C} = \mathcal{L}(B)$.
‡Here $u_i \in \mathcal{C}$ is a codeword of length $n$, not the $i$th bit of some binary word $u$.
§Some authors use $(n, k, d)$ to denote the parameters of a linear code. The original $(n, 2^k, d)$ notation will be retained in this book.

While the notion of a scalar ("dot") product has the obvious Boolean analog, its interpretation is a little different. If $u = u_1 u_2 \cdots u_n$ and $v = v_1 v_2 \cdots v_n$ are binary words of length $n$ then

$$u \cdot v = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n. \qquad (6.3)$$

Whereas in ordinary linear algebra, $u \cdot u = \|u\|^2$ is the square of the magnitude of $u$, in Boolean linear algebra, $u \cdot u$ is the "parity" of $u$.

6.1.8 Definition. The parity of a binary word $w$ is 1 if the weight of $w$ is odd, and 0 if $\mathrm{wt}(w)$ is even.

6.1.9 Example. Consider the (3,4,1) linear code $\mathcal{C} = \{000, 001, 100, 101\}$. With minimum distance $d = 1$, $\mathcal{C}$ cannot (reliably) detect, much less correct, even a single transmission error. One way to "fix" this deficiency is by repetition, e.g., by sending each message twice. This can be done in two rather different ways. If, e.g., the message is 000-100, repetition could take the form 000-100-000-100, where the message is followed by a duplicate message. An alternative would be to duplicate each word of the message as it is sent, resulting in 000-000-100-100. This alternative is equivalent to sending 000000-100100, i.e., to sending each word once, using a different code. The "repetition" code $\mathcal{C}^{(2)} = \{000000, 001001, 100100, 101101\}$ is obtained by replacing each codeword $xyz \in \mathcal{C}$ with the concatenated word $xyzxyz \in \mathcal{C}^{(2)}$.

Because addition of binary words is bitwise, the linearity of $\mathcal{C}^{(2)}$ is an immediate consequence of the linearity of $\mathcal{C}$. In particular, Corollary 6.1.7 may be used to determine that the minimum distance between any two codewords of $\mathcal{C}^{(2)}$ is $d = 2$. Thus, $\mathcal{C}^{(2)}$ is a (6,4,2) code capable of detecting (single) errors, thus "fixing" the deficiency of the original code $\mathcal{C}$.

Here is an alternative to $\mathcal{C}^{(2)}$. Denote by $\mathcal{C}^+ = \{0000, 0011, 1001, 1010\}$ the code obtained from $\mathcal{C}$ by adding a single parity check bit to the end of each word, i.e., by replacing $xyz \in \mathcal{C}$ with $xyzp \in \mathcal{C}^+$, where $p$ is the parity of $xyz$. Because $\mathcal{C}^+$ is a linear code (the proof is left to the exercises, but why wait?), Corollary 6.1.7 can be used to deduce that $\mathcal{C}^+$ is a (4,4,2) code capable of detecting (single) errors. Thus, $\mathcal{C}^+$ also fixes the deficiency.

Because its codewords are shorter, $\mathcal{C}^+$ has an obvious advantage over $\mathcal{C}^{(2)}$ in efficiency (and speed). The concatenation idea, on the other hand, seems to have an advantage over the parity check bit idea because it can be extended to obtain, e.g., a (9,4,3) (linear) code $\mathcal{C}^{(3)}$. Because every codeword in $\mathcal{C}^+$ has even weight (parity 0), passing to $\mathcal{C}^{++}$ increases the length of the code without increasing the minimum distance between codewords. The obvious extension of the idea that led to $\mathcal{C}^+$ is useless. There are, however, more subtle extensions of the parity check bit idea that hold enormous power. While these extensions will not be fully developed until Section 6.2, they begin with the innocent observation that

$$xyzp \cdot 1111 = 0 \qquad (6.4)$$

for all $xyzp \in \mathcal{C}^+$. □

In ordinary linear algebra, the scalar product $u \cdot v = 0$ if and only if $u$ and $v$ are orthogonal. It is convenient to use this same terminology in Boolean linear algebra.

6.1.10 Definition. Binary words $u, v \in F^n$ are orthogonal if $u \cdot v = 0$.

Equation (6.4) gives a necessary condition for a binary word $w$ to belong to the parity check bit code $\mathcal{C}^+$ of Example 6.1.9, namely, $w \cdot 1111 = 0$. Because, e.g., $w = 0110$ is orthogonal to 1111 but $w \notin \mathcal{C}^+$, this necessary condition is not sufficient. The key to the subtle but powerful extensions of the parity check bit idea entails orthogonality conditions that are both necessary and sufficient. (In the case of Example 6.1.9, $w \in \mathcal{C}^+$ if and only if $w \cdot 1111 = 0$ and $w \cdot 0100 = 0$.)

6.1.11 Definition. The orthogonal complement of a nonempty subset $S \subseteq F^n$ is the set $S^\perp = \{w \in F^n : u \cdot w = 0 \text{ for all } u \in S\}$.

6.1.12 Example. Because orthogonality has more to do with parity than perpendicularity, care should be taken with this concept, e.g., if $S = \{000, 101, 010\}$, then $S^\perp = \{000, 101\} \subseteq S$. □
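For small $n$, an orthogonal complement can be found by testing every word of $F^n$ against Definition 6.1.11. The following brute-force sketch uses hypothetical function names and represents binary words as strings.

```python
from itertools import product


def dot(u, v):
    """Boolean scalar product: the sum of u_i * v_i, reduced with 1 + 1 = 0."""
    return sum(int(a) & int(b) for a, b in zip(u, v)) % 2


def orthogonal_complement(S, n):
    """All words of length n that are orthogonal to every word in S."""
    words = ("".join(bits) for bits in product("01", repeat=n))
    return {w for w in words if all(dot(u, w) == 0 for u in S)}


print(sorted(orthogonal_complement({"000", "101", "010"}, 3)))  # ['000', '101']
print(dot("101", "101"))  # 0
```

The second line shows the phenomenon behind Example 6.1.12: a nonzero word of even weight is orthogonal to itself.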

6.1.13 Theorem. If $S$ is a nonempty subset of $F^n$, then $S^\perp$ is a linear code.

Proof. Because $u \cdot 0 = 0$ for all $u \in S$, $0 \in S^\perp$. If $v, w \in S^\perp$ then, for all $u \in S$, $u \cdot (v + w) = u \cdot v + u \cdot w = 0 + 0 = 0$, so $v + w \in S^\perp$. ■

Since $u \cdot w = w \cdot u$, $S \subseteq S^{\perp\perp}$. Because $S^{\perp\perp}$ is a linear code and $\mathcal{L}(S)$ is the intersection of all linear codes containing $S$, it is evidently the case that $\mathcal{L}(S) \subseteq S^{\perp\perp}$.

6.1.14 Theorem. If $\emptyset \ne S \subseteq F^n$, then $\mathcal{L}(S) = S^{\perp\perp}$.

The proof of Theorem 6.1.14 will occupy us for the rest of this section. Before getting to the details, let's discuss some implications.

6.1.15 Corollary. If $\mathcal{C}$ is a linear code, then $\mathcal{C} = \mathcal{C}^{\perp\perp}$.

Proof. If $\mathcal{C}$ is a linear code, then $\mathcal{C} = \mathcal{L}(\mathcal{C}) = \mathcal{C}^{\perp\perp}$. ■

6.1.16 Definition. The dual of a linear code $\mathcal{C}$ is the linear code $\mathcal{C}^\perp$.

By Corollary 6.1.15, $(\mathcal{C}^\perp)^\perp = \mathcal{C}^{\perp\perp} = \mathcal{C}$, i.e., the dual of $\mathcal{C}^\perp$ is $\mathcal{C}$. It seems that every linear code is paired with a unique (linear) dual.

Every bit as interesting is the case in which $\mathcal{C} = S^\perp$ for some nonlinear code $S$. By Theorem 6.1.13, $\mathcal{C}$ is a linear code. By Theorem 6.1.14 and the definitions, the dual of $\mathcal{C}$ is $S^{\perp\perp} = \mathcal{L}(S)$. Finally, the dual of $\mathcal{L}(S)$ is $\mathcal{C} = S^\perp$, i.e.,

$$\mathcal{L}(S)^\perp = S^\perp. \qquad (6.5)$$



6.1.17 Example. Let's return to Example 6.1.12, where $S = \{000, 101, 010\} \subseteq F^3$. Because $B = \{101, 010\}$ is linearly independent, it is a basis of $\mathcal{L}(S)$. So, any codeword in $\mathcal{L}(S)$ is uniquely expressible in the form $a_1\,101 + a_2\,010$, where $a_1, a_2 \in F = \{0, 1\}$. It follows, as in Equation (6.2), that $\mathcal{L}(S)$ contains $2 \times 2 = 4$ codewords, three of which are already in $S$. The "missing" word is 111, corresponding to $a_1 = a_2 = 1$, i.e.,

$$\mathcal{L}(S) = \{000, 101, 010, 111\}.$$

From Equation (6.5) and Example 6.1.12, $\mathcal{L}(S)^\perp = S^\perp = \{000, 101\}$. Now, despite the fact that $\mathcal{L}(S)^\perp \subseteq \mathcal{L}(S)$, it is nevertheless the case (as in ordinary linear algebra) that $\dim(\mathcal{L}(S)) + \dim(\mathcal{L}(S)^\perp) = 2 + 1 = 3 = \dim(F^3)$. □

The key to proving Theorem 6.1.14 is the following extension of the last observation from Example 6.1.17.

6.1.18 Lemma. If $\mathcal{C} \subseteq F^n$ is a linear code, then

$$\dim(\mathcal{C}) + \dim(\mathcal{C}^\perp) = n. \qquad (6.6)$$

Before embarking on the somewhat technical proof of Lemma 6.1.18, let's see 
another example. 

6.1.19 Example. Let B = { 1 1010, 01 101 , 01 1 10}. We claim that B is linearly 
independent. To prove it, observe that the (vector) equation 

x 11010 + y 01101 +z OHIO = 00000 

is equivalent to five linear equations, the first and fifth of which are x = 0 and v = 0. 
Together with any of the remaining three equations, these yield z = 0. 

Let r £ = J^(B), the linear code with basis B. From Definition 6.1.11, w S # if 
and only if u ■ w = 0 for all if and only if u ■ w = 0 for all u € B. If 

w = x 1 x 2 x 3 X4X5, then 1 1010 • w = x x + x 2 + x 4 , 01 101 • w = x 2 + x 3 + x 5 , and 
01 1 10 • w = x 2 + x 3 + X4, i.e., w <G < ^ ± if and only if 

xi + X2 + X4 = 0, (6.7a) 
x 2 + x 3 + x 5 = 0, (6.7b) 
x 2 + x 3 + x 4 = 0. (6-7c) 

This homogeneous system of linear equations is equivalent to the single matrix 
equation Gw l = 0, where 0 is the 3x1 column vector of zeros, w' is the 
transpose of w (the 5x1 column vector whose ith component is x,), and 

/ 1 1 0 1 0\ 
G= 0 1 1 0 1 
\0 1 1 1 0/ 
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is the 3 x 5 matrix whose rows are the basis vectors of ( €. In other words, w e if 
and only if w 1 belongs to the kernel of G. Thus, it appears that Equation (6.6) is a 
consequence of the well-known theorem from (ordinary) linear algebra that the sum 
of the rank and nullity of a k x n matrix is equal to n. To confirm that this result is 
still valid in Boolean linear algebra, let's walk through the proof for this example. 

Because its rows are a basis of $\mathcal{C}$, matrix G has rank k = 3, and $\mathcal{C}$ is equal to the
row space of G. Recall that the row space of a matrix is unchanged by elementary
row operations (a fact that remains valid in the context of Boolean arithmetic).
Adding the second row of G to its first and third rows produces (because Boolean
addition and subtraction are the same) the row equivalent matrix

$$G' = \begin{pmatrix} 1&0&1&1&1 \\ 0&1&1&0&1 \\ 0&0&0&1&1 \end{pmatrix}.$$

Adding the third row of G' to the first row yields the Hermite normal form (or
reduced row echelon form)* of G, namely,

$$G'' = \begin{pmatrix} 1&0&1&0&0 \\ 0&1&1&0&1 \\ 0&0&0&1&1 \end{pmatrix}.$$

Because Gx = 0 and G''x = 0 have the same solution set, the linear system in
Equations (6.7a)-(6.7c) is equivalent to

$x_1 + x_3 = 0,$ (6.8a)
$x_2 + x_3 + x_5 = 0,$ (6.8b)
$x_4 + x_5 = 0.$ (6.8c)

Solving for the leading, pivot variables, we obtain

$x_1 = x_3,$
$x_2 = x_3 + x_5,$
$x_4 = x_5,$

where the dependent (pivot) variables have been expressed as (linear) functions of
the independent (nonpivot) variables $x_3$ and $x_5$. These last three equations can be
expressed as the single vector equation

$x_1x_2x_3x_4x_5 = x_3\,11100 + x_5\,01011.$ (6.9)

*See Appendix 3.

In other words, $w = x_1x_2x_3x_4x_5 \in \mathcal{C}^\perp$ if and only if w is a (Boolean) linear
combination of the linearly independent vectors 11100 and 01011, i.e., {11100, 01011} is
a basis for $\mathcal{C}^\perp$. In particular, $\dim(\mathcal{C}^\perp) = 2$. Thus, $\dim(\mathcal{C}) + \dim(\mathcal{C}^\perp) = 3 + 2 = 5$,
the length of the codewords.

It is interesting to observe the dominant role played by bits $x_3$ and $x_5$ in
Equation (6.9). Once the binary words (vectors) 11100 and 01011 have been
identified, $\mathcal{C}^\perp$ is completely determined by $x_3$ and $x_5$. What is the use of $x_1$, $x_2$, and $x_4$?
The answer is clear from Equations (6.8a)-(6.8c), where these pivot bits can be
seen to play the role of parity check digits! This is the sense in which the idea
leading to $\mathcal{C}^+$ in Example 6.1.9 can be generalized to obtain error-correcting codes that
are vastly superior to repetition codes.

Finally, interpreting Equations (6.8a)-(6.8c) to mean that $w \in \mathcal{C}^\perp$ if and only
if $w \cdot 10100 = 0$, $w \cdot 01101 = 0$, and $w \cdot 00011 = 0$ reminds us that the
word "orthogonality" has been borrowed from another context. In Boolean linear
algebra, orthogonality should be interpreted in terms of parity. □
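The conclusions of Example 6.1.19 can be checked by brute force. The following is a minimal Python sketch, not part of the original text; the helper names (`bits`, `dot`, `span`, `dual`, `dimension`) are hypothetical and chosen only for illustration. It enumerates all of $F^5$, keeps the words orthogonal to every codeword, and compares the result with the basis {11100, 01011} found above.

```python
from itertools import product

def bits(word):
    """Turn a string such as "11010" into a tuple of 0/1 integers."""
    return tuple(int(ch) for ch in word)

def dot(u, v):
    """Boolean dot product: parity of the number of positions where both words have a 1."""
    return sum(a & b for a, b in zip(u, v)) % 2

def span(basis, n):
    """Set of all Boolean linear combinations of the words in `basis` (each of length n)."""
    words = set()
    for coeffs in product((0, 1), repeat=len(basis)):
        word = tuple(sum(c * b[i] for c, b in zip(coeffs, basis)) % 2 for i in range(n))
        words.add(word)
    return words

def dual(code, n):
    """Brute-force dual code: every word of F^n orthogonal to all words of `code`."""
    return {w for w in product((0, 1), repeat=n) if all(dot(u, w) == 0 for u in code)}

def dimension(linear_code):
    """A linear code with 2**k codewords has dimension k."""
    return len(linear_code).bit_length() - 1

B = [bits("11010"), bits("01101"), bits("01110")]
C = span(B, 5)
C_perp = dual(C, 5)
print(dimension(C), dimension(C_perp))
```

Exhaustive enumeration is only practical for short codewords; it is used here because it follows Definition 6.1.11 literally rather than relying on row reduction.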

Proof of Lemma 6.1.18 (i.e., $\dim(\mathcal{C}) + \dim(\mathcal{C}^\perp) = n$). Suppose B is a basis of the
$(n, 2^k, d)$ linear code $\mathcal{C}$. Let G be the k x n matrix whose rows are the vectors in B.
Then $\dim(\mathcal{C}) = k = \mathrm{rank}(G)$, the number of pivot variables in the Hermite normal
form of G. Because of the identification of $\mathcal{C}^\perp$ with the kernel of G,
$\dim(\mathcal{C}^\perp) = \mathrm{nullity}(G)$ is the number of nonpivot variables of G. It remains to
observe that the total number of variables is equal to the number of columns of G.



Proof of Theorem 6.1.14 (i.e., $\mathcal{L}(S) = S^{\perp\perp}$). Suppose $\emptyset \ne S \subseteq F^n$. Let $B \subseteq S$ be
a basis of $\mathcal{L}(S)$, and G the k x n matrix whose rows are the vectors in B. Then
$S^\perp = \{v \in F^n : Gv^t = 0\} = \mathcal{L}(S)^\perp$. Therefore, $\dim(S^\perp) = \dim(\mathcal{L}(S)^\perp)$.
It follows from Lemma 6.1.18 (and Theorem 6.1.13) that

$\dim(S^\perp) + \dim(S^{\perp\perp}) = n = \dim(\mathcal{L}(S)) + \dim(\mathcal{L}(S)^\perp).$

Subtracting $\dim(S^\perp)$ from the left-hand side and $\dim(\mathcal{L}(S)^\perp)$ from the right leaves
$\dim(S^{\perp\perp}) = \dim(\mathcal{L}(S))$. Since it was established in the discussion leading up to
the statement of Theorem 6.1.14 that $\mathcal{L}(S) \subseteq S^{\perp\perp}$, the proof is complete. ■

The notions that emerged in Example 6.1.19 have implications far beyond the 
proofs of Lemma 6.1.18 and Theorem 6.1.14. Let's summarize their most striking 
features. 

6.1.20 Definition. Let B be a fixed but arbitrary basis of the linear $(n, 2^k, d)$ code
$\mathcal{C}$. The k x n matrix G whose rows are the vectors of B is a generating matrix for $\mathcal{C}$.

6.1.21 Theorem. If G is a generating matrix for the linear code $\mathcal{C}$, then $w \in \mathcal{C}^\perp$
if and only if $Gw^t = 0$, if and only if $w^t \in \ker(G)$, the kernel of G.

6.1. EXERCISES

1 Compute 

(a) wt(110100010).

(b) wt(001011101). 

2 Compute the Boolean sum 

(a) 110100010 + 001011101. 

(b) 110100010+ 110100010. 

3 Find a basis for $\mathcal{L}(S)$ when

(a) S= {1100,0011}. 

(b) S= {1110,0111}. 

(c) S= {1100, 1010, 1001,0110,0101,0011}. 

(d) S= {11000, 10100, 10010, 10001,01100,01010,01001,00110,00101, 
00011}. 

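4 If $\mathcal{C}_1$ and $\mathcal{C}_2$ are linear codes, define $\mathcal{C}_1 + \mathcal{C}_2 = \{u + v : u \in \mathcal{C}_1$ and
$v \in \mathcal{C}_2\}$.

(a) Show that $\mathcal{C}_1 + \mathcal{C}_2$ is a linear code.

(b) Show that $(\mathcal{C}_1 + \mathcal{C}_2)^\perp = \mathcal{C}_1^\perp \cap \mathcal{C}_2^\perp$.

5 Let S be the (nonlinear) binary code in Exercise 3(c). Exhibit all the binary
words in $\mathcal{L}(S) \setminus S$, the complement of S in $\mathcal{L}(S)$.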

6 Let B be a basis of the linear $(n, 2^k, d)$ code $\mathcal{C}$. Prove or disprove that
$d = \min_{b \in B} \mathrm{wt}(b)$.

7 A nonempty set $S \subseteq F^n$ is orthogonal if $u \cdot v = 0$ for all $u, v \in S$, $u \ne v$. Prove
or disprove that an orthogonal set is linearly independent.

8 Prove that $\mathrm{wt}(u + v) \le \mathrm{wt}(u) + \mathrm{wt}(v)$ for all $u, v \in F^n$.

9 Let $S = \{u \in F^n : \mathrm{wt}(u)$ is odd$\}$. Prove that $\mathcal{L}(S)$ is an $(n, 2^n, 1)$ code.

10 Let $S = \{u \in F^n : \mathrm{wt}(u)$ is even$\}$. If $n \ge 2$,

(a) prove that $\mathcal{L}(S) = S$.

(b) prove that S is an $(n, 2^{n-1}, 2)$ linear code.

11 Let $\mathcal{C}$ = {0000, 1100, 0011, 1111}.

(a) Prove that $\mathcal{C}$ is a linear code.

(b) Prove that $\mathcal{C}$ is self-dual, i.e., that $\mathcal{C}^\perp = \mathcal{C}$.

12 Suppose $\mathcal{C}$ is an $(n, 2^k, d)$ linear code. As in Example 6.1.9, let $\mathcal{C}^+$ be the
corresponding parity check code obtained from $\mathcal{C}$ by appending a single parity
check bit to the end of each codeword, i.e., by replacing $xy \ldots z \in \mathcal{C}$ with
$xy \ldots zp \in \mathcal{C}^+$, where p is the parity of $xy \ldots z$.

(a) Prove that $\mathcal{C}^+$ is a linear code.

(b) Prove that $\mathcal{C}$ and $\mathcal{C}^+$ have the same dimension.

(c) If d is odd, show that $\mathcal{C}^+$ is an $(n + 1, 2^k, d + 1)$ code.

(d) If d is even, show that $\mathcal{C}^+$ is an $(n + 1, 2^k, d)$ code.

13 A linear code $\mathcal{C}$ is self-dual if $\mathcal{C}^\perp = \mathcal{C}$.

(a) Prove that a self-dual $(n, 2^k, d)$ linear code has dimension k = n/2.

(b) Construct a self-dual linear code of length 8.

14 Prove or disprove that a linear code of dimension k has (exactly) 

$$\frac{1}{k!} \prod_{i=0}^{k-1} (2^k - 2^i)$$

different (unordered) bases. 

15 Find a basis for the dual code $\mathcal{C}^\perp$, where $\mathcal{C} = \mathcal{L}(B)$ is the linear code with basis

(a) B = {10000,01000,00100}. 

(b) B= {110111,111101,110011}. 

16 Let $\mathcal{C}$ be a (not necessarily linear) (n, M, d) binary code. The weight
enumerator of $\mathcal{C}$ is the two-variable polynomial

$$W_{\mathcal{C}}(x, y) = \sum_{c \in \mathcal{C}} x^{\mathrm{wt}(c)} y^{n - \mathrm{wt}(c)}.$$

F. J. MacWilliams (1917-1990) discovered a relation between the weight
enumerators of a linear code and its dual, namely,

$$W_{\mathcal{C}^\perp}(x, y) = \frac{W_{\mathcal{C}}(y - x, x + y)}{M}.$$

Confirm this identity for

(a) the code $\mathcal{C} = \mathcal{L}(S)$ in Example 6.1.17.

(b) the code $\mathcal{C} = \mathcal{L}(B)$ in Example 6.1.19.

(c) the code $\mathcal{C} = \mathcal{L}(S)$ in Exercise 3(c).

(d) the self-dual code $\mathcal{C}$ in Exercise 11.

(e) the code $\mathcal{C} = \mathcal{L}(B)$ in Exercise 15(a).

(f) the code $\mathcal{C} = \mathcal{L}(B)$ in Exercise 15(b).

17 Prove or disprove that the number of different
(a) k x n matrices over F = {0, 1} is $2^{nk}$.
(b) k x n reduced row echelon form matrices of rank k over F is

r—l 

(c) k-dimensional subspaces of $F^n$ is $C_2(n, k)$.

18 (L. Lovász) Let $A = (a_{ij})$ be an n x n, symmetric (0,1)-matrix. Let $\mathcal{C}$ be the
(Boolean) row space of A. Prove or disprove that $\mathrm{diag}(A) \in \mathcal{C}$, where diag(A)
is the binary word $a_{11}a_{22} \cdots a_{nn}$.

19 Give a direct proof of Equation (6.5), i.e., one based on Definitions 6.1.6 and 
6.1.11. 



6.2. DECODING ALGORITHMS 

Human history becomes more and more a race between education and catastrophe. 

— H. G. Wells 

Recall, from Theorem 6.1.21, that if G is a generating matrix for the linear code $\mathcal{C}$,
then $w \in \mathcal{C}^\perp$ if and only if $Gw^t = 0$. Turning this around, $w \in \mathcal{C} = \mathcal{C}^{\perp\perp}$ if and only
if $Hw^t = 0$, where H is a generating matrix for $\mathcal{C}^\perp$. Let's investigate this back-door
way of defining $\mathcal{C}$.

It follows from Definition 6.1.20 that an m x n, (0,1)-matrix H is a generating
matrix for some linear code if and only if its rows are linearly independent. As a
warm-up exercise, fix an arbitrary integer $m \ge 2$, let $n = 2^m - 1$, and define $H_m$ to
be the m x n matrix whose jth column is the (transposed) binary numeral for j,
$1 \le j \le n$. Then

$$H_2 = \begin{pmatrix} 0&1&1 \\ 1&0&1 \end{pmatrix},$$

$$H_3 = \begin{pmatrix} 0&0&0&1&1&1&1 \\ 0&1&1&0&0&1&1 \\ 1&0&1&0&1&0&1 \end{pmatrix},$$ (6.10)

and

$$H_{m+1} = \begin{pmatrix} Z_m & 1 & U_m \\ H_m & O_m & H_m \end{pmatrix},$$

where $Z_m$ and $O_m$ are the $1 \times (2^m - 1)$ and $m \times 1$ zero matrices, respectively, and
$U_m$ is the $1 \times (2^m - 1)$ matrix each of whose entries is 1. (Confirm that
$\mathrm{rank}(H_{m+1}) = m + 1$, $m \ge 1$.)

Let $\mathcal{C}_m \subseteq F^n$ be the m-dimensional linear code generated by $H_m$, and define
$\mathcal{H}_m = \mathcal{C}_m^\perp = \{v \in F^n : H_m v^t = 0\}$. Then, by Lemma 6.1.18, $\mathcal{H}_m$ is an $(n, 2^{n-m}, d)$
linear code, where $n = 2^m - 1$ and

$$d = \min\{\mathrm{wt}(w) : w \in \mathcal{H}_m,\ w \ne 0\}.$$

Observe that d = 1 only if there is a codeword $w \in \mathcal{H}_m$ of weight 1. If the single
nonzero bit of w = 0...010...0 is the jth, then $H_m w^t$ is equal to column j of $H_m$. But
$w \in \mathcal{H}_m$ if and only if $H_m w^t = 0$. Because no column of $H_m$ is zero, no codeword
of $\mathcal{H}_m$ can have weight 1. What about 2? If $w \in F^n$ has exactly two nonzero bits,
say bits i and j, then $H_m w^t$ is the sum of column i and column j of $H_m$. But the
columns of $H_m$ are just (transposed) binary words from $F^m$. If $u, u' \in F^m$, then
the (Boolean) sum u + u' = 0 if and only if u = u'. Since no two distinct integers
have identical numerals, no two columns of $H_m$ are the same. Therefore, $d \ge 3$.
Finally, for all $m \ge 2$, $H_m w^t = 0$ when w = 11100...0. (If m = 2, there are no
zeros in w.) Thus $w \in \mathcal{H}_m$, so $d \le 3$.

6.2.1 Definition. For a fixed but arbitrary integer $m \ge 2$, let $n = 2^m - 1$ and
define $H_m$ to be the m x n matrix whose jth column is the binary numeral for j.
Then the mth Hamming code is the $(n, 2^{n-m}, 3)$ linear code
$\mathcal{H}_m = \{w \in F^n : H_m w^t = 0\}$.
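As an illustration of the definition, here is a small Python sketch (the function names `hamming_parity_check`, `column_product`, and `hamming_code` are hypothetical, introduced only for this example). It builds $H_m$ column by column from binary numerals and then lists the Hamming code by exhaustive search, which is feasible only for small m.

```python
from itertools import product

def hamming_parity_check(m):
    """Rows of H_m: column j (1-based) is the m-bit binary numeral for j, top bit first."""
    n = 2**m - 1
    return [[(j >> (m - 1 - i)) & 1 for j in range(1, n + 1)] for i in range(m)]

def column_product(H, w):
    """The Boolean product H w^t, returned as a tuple of bits."""
    return tuple(sum(h * x for h, x in zip(row, w)) % 2 for row in H)

def hamming_code(m):
    """All words w of length 2**m - 1 with H_m w^t = 0 (brute force)."""
    H = hamming_parity_check(m)
    n = 2**m - 1
    zero = (0,) * m
    return [w for w in product((0, 1), repeat=n) if column_product(H, w) == zero]

H3 = hamming_parity_check(3)
code = hamming_code(3)
min_weight = min(sum(w) for w in code if any(w))
print(len(code), min_weight)
```

For m = 3 the search finds $2^{7-3} = 16$ codewords whose smallest nonzero weight is 3, in agreement with the parameters stated in the definition.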

Recall that a code of length n is perfect if $F^n$ is the disjoint union of the "spheres
of influence" of its codewords.

6.2.2 Theorem. If $m \ge 2$, then $\mathcal{H}_m$ is a perfect, 1-error-correcting, linear code.

Proof. Only the perfection of $\mathcal{H}_m$ remains to be proved, and that is an immediate
consequence of Lemma 1.4.14. ■

6.2.3 Example. Definition 6.2.1 gives an implicit (back-door) description of
$\mathcal{H}_m$. Let's find an explicit description, e.g., of $\mathcal{H}_3$.

By definition, $\mathcal{H}_3 = \ker(H_3)$, the kernel of $H_3$, also known as the orthogonal
complement of the row space of $H_3$. Since the row space of a matrix is left
unchanged by elementary row operations, $\mathcal{H}_3$ could just as well be described as
the orthogonal complement of the row space of the Hermite normal form of $H_3$.
From Equation (6.10),

$$H_3 = \begin{pmatrix} 0&0&0&1&1&1&1 \\ 0&1&1&0&0&1&1 \\ 1&0&1&0&1&0&1 \end{pmatrix}.$$

Interchanging rows 1 and 3 produces its Hermite normal form

$$H' = \begin{pmatrix} 1&0&1&0&1&0&1 \\ 0&1&1&0&0&1&1 \\ 0&0&0&1&1&1&1 \end{pmatrix}$$ (6.11)

(with pivot columns 1, 2, and 4). So, $w = x_1x_2 \cdots x_7 \in \mathcal{H}_3$ if and only if $H'w^t = 0$;
if and only if

$x_1 + x_3 + x_5 + x_7 = 0,$
$x_2 + x_3 + x_6 + x_7 = 0,$
$x_4 + x_5 + x_6 + x_7 = 0;$

if and only if

$x_1 = x_3 + x_5 + x_7,$ (6.12a)
$x_2 = x_3 + x_6 + x_7,$ (6.12b)
$x_4 = x_5 + x_6 + x_7;$ (6.12c)

if and only if

$x_1x_2 \cdots x_7 = x_3\,1110000 + x_5\,1001100 + x_6\,0101010 + x_7\,1101001.$ (6.13)

Evidently, B = {1110000, 1001100, 0101010, 1101001} is a basis for $\mathcal{H}_3$ and,
therefore,

$$G = \begin{pmatrix} 1&1&1&0&0&0&0 \\ 1&0&0&1&1&0&0 \\ 0&1&0&1&0&1&0 \\ 1&1&0&1&0&0&1 \end{pmatrix}$$ (6.14)

is a generating matrix for $\mathcal{H}_3$. □
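The row reduction and back-substitution in Example 6.2.3 can be mechanized. The sketch below is an assumption-laden illustration, not the book's procedure verbatim: `rref_gf2` and `kernel_basis_gf2` are hypothetical helper names, and matrices are represented as lists of 0/1 lists. It computes the Hermite normal form over F = {0, 1} and then reads off one kernel basis word per nonpivot column, exactly as Equation (6.13) is read off from Equations (6.12a)-(6.12c).

```python
def rref_gf2(rows):
    """Hermite normal (reduced row echelon) form over F = {0, 1}.

    Returns (matrix, pivot_columns) with 0-based column indices.
    """
    M = [list(r) for r in rows]
    pivots = []
    r = 0
    for c in range(len(M[0])):
        # Find a row at or below position r with a 1 in column c.
        pivot = next((i for i in range(r, len(M)) if M[i][c]), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        # Boolean addition is XOR; clear column c in every other row.
        for i in range(len(M)):
            if i != r and M[i][c]:
                M[i] = [a ^ b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == len(M):
            break
    return M, pivots

def kernel_basis_gf2(rows):
    """One basis word of {w : M w^t = 0} for each nonpivot (free) column."""
    M, pivots = rref_gf2(rows)
    n = len(M[0])
    basis = []
    for f in (c for c in range(n) if c not in pivots):
        w = [0] * n
        w[f] = 1
        for r, p in enumerate(pivots):
            w[p] = M[r][f]  # pivot bit = coefficient of the free bit in its row
        basis.append(w)
    return basis

H3 = [[0, 0, 0, 1, 1, 1, 1], [0, 1, 1, 0, 0, 1, 1], [1, 0, 1, 0, 1, 0, 1]]
normal_form, pivot_columns = rref_gf2(H3)
basis = kernel_basis_gf2(H3)
```

With $H_3$ as input, the pivot columns come out as 1, 2, and 4 (0-based 0, 1, 3) and the basis words are the rows of the matrix G in Equation (6.14).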



Suppose $w \in \mathcal{H}_m \subseteq F^n$ is sent and $v \in F^n$ is received. Because $\mathcal{H}_m$ is perfect, v
lies in the sphere of influence, $S_1(c)$, of some (unique) codeword c.* Thus, c will be
the output of any (valid) nearest-neighbor decoding algorithm. The missing piece is
the algorithm.

Recall that $\mathcal{H}_m$ consists of $M = 2^{2^m - m - 1}$ codewords. That "tower of exponents"
indicates that M is likely to be BIG. While having a large vocabulary is good for
composing messages, it means decoding algorithms based on computing d(v, c), for
all $c \in \mathcal{H}_m$ and all v in the message, are likely to be slow. Is there an alternative?
Yes, that's the best part!

Let $u = H_m v^t$. If u = 0, then $v \in \mathcal{H}_m$ and c = v, i.e., no correction takes place. If
$u \ne 0$, then v is not a codeword. In that case, c is the unique codeword that differs
from v in a single bit. If we just knew which bit that was, changing it would yield c;
if c differs from v in the jth bit, then c = v + b, where b = 0...010...0 is the
binary word whose only nonzero bit is the jth. Here is the easy way to find j.

*Recall that $S_r(c) = \{y \in F^n : d(c, y) \le r\}$, where $r = \lfloor (d - 1)/2 \rfloor$.

In Boolean arithmetic, c = v + b if and only if c + b = v. Together with the fact
that $H_m c^t = 0$, this yields

$$\begin{aligned} u &= H_m v^t \\ &= H_m (c + b)^t \\ &= H_m c^t + H_m b^t \\ &= H_m b^t, \end{aligned}$$ (6.15)

the jth column of $H_m$. Evidently, all one needs do is scan the columns of $H_m$ looking
for u. Locating u in the jth column of $H_m$ means that c differs from v in the jth bit!

The bad news is that $H_m$ has $n = 2^m - 1$ columns. That's more than a million,
even for m as small as 20. The good news is that j can be found without scanning all
of the columns of $H_m$. In fact, it can be found without scanning any columns!

Recall that $H_m$ is not just some random m x n matrix. It is the unique m x n
matrix whose jth column is the binary numeral for j. Therefore, u is the jth column
of $H_m$ if and only if u is the binary numeral for j. From the perspective of the base
2 numeration system, u = j.

Let's summarize. When binary word v is received, it is decoded as

$c = v + b,$ (6.16)

where the binary word b is determined by $u = H_m v^t$. If u = 0, then b = 0; if $u \ne 0$,
then b has a single nonzero bit in the jth position, where j is determined by
converting the binary numeral u to base 10.

6.2.4 Example. To find the codeword $c \in \mathcal{H}_3$ nearest to v = 0101100, observe
that

$$u = H_3 v^t = \begin{pmatrix} 0&0&0&1&1&1&1 \\ 0&1&1&0&0&1&1 \\ 1&0&1&0&1&0&1 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \\ 0 \\ 1 \\ 1 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}.$$

Because $u^t = 011$ is the binary numeral for j = 3, b = 0010000 and

c = 0101100 + 0010000
  = 0111100.

(Confirm that u is the third column of $H_3$.) □

A formal nearest-neighbor decoding algorithm for Hamming codes might look
something like this:

6.2.5 Algorithm. The codeword $c \in \mathcal{H}_m$ nearest to $v \in F^n$ is determined as
follows:

1. Let $u = H_m v^t$.

2. If $u^t = 00 \ldots 0 \in F^m$, then $b = 00 \ldots 0 \in F^n$. Go to step 5.

3. Let j be the integer whose binary numeral is $u^t$.

4. Let $b = 0 \cdots 010 \cdots 0 \in F^n$, where the jth bit from the left is 1.

5. Return c = v + b. □
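Algorithm 6.2.5 translates almost line for line into code. The following Python sketch is one possible rendering under stated assumptions: words are lists of 0/1 integers, and the function name `hamming_decode` is hypothetical. Because column j of $H_m$ is the binary numeral for j, the numeral $u^t$ is the bitwise XOR of the (1-based) positions in which v has a 1, so the matrix never has to be built.

```python
def hamming_decode(v, m):
    """Nearest codeword of the m-th Hamming code to the bit list v of length 2**m - 1."""
    n = 2**m - 1
    if len(v) != n:
        raise ValueError("v must have length 2**m - 1")
    # Steps 1 and 3: u = H_m v^t read as a base-2 numeral. Each 1 in position j of v
    # contributes column j of H_m, i.e., the numeral for j; Boolean addition is XOR.
    j = 0
    for position, bit in enumerate(v, start=1):
        if bit:
            j ^= position
    # Steps 2, 4, 5: if j == 0, v is already a codeword; otherwise flip bit j.
    c = list(v)
    if j:
        c[j - 1] ^= 1
    return c

decoded = hamming_decode([0, 1, 0, 1, 1, 0, 0], 3)
print("".join(map(str, decoded)))  # the word v = 0101100 of Example 6.2.4
```

For v = 0101100 the 1s sit in positions 2, 4, and 5, whose XOR is 3, so bit 3 is flipped and the result is 0111100, as in Example 6.2.4.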



If $w \in \mathcal{H}_m$ is sent and $v \in F^n$ is received, won't Algorithm 6.2.5 sometimes
yield the wrong codeword? The answer depends on what is meant by wrong. If
more than one bit of w is changed in transmission, no (valid) nearest-neighbor
decoding algorithm will correct v to w. If the transmission channel is noisy enough
for more than one error to occur with unacceptably high probability, a code that can
correct more than one error should be chosen! If the code of choice is $\mathcal{H}_m$, then
Algorithm 6.2.5 produces the right (nearest-neighbor) codeword!

In preparation for the more challenging problem of decoding general linear
codes, it will be helpful to review the key steps that led to Algorithm 6.2.5. As
in Example 6.1.19, Equations (6.12a)-(6.12c) show that the dependent pivot
variables $x_1$, $x_2$, and $x_4$ can be viewed as parity check digits. This is the source of the
following terminology.

6.2.6 Definition. Let $\mathcal{C}$ be a linear $(n, 2^k, d)$ code. A parity check matrix for $\mathcal{C}$ is
a generating matrix for $\mathcal{C}^\perp$.

Evidently, H is a parity check matrix for $\mathcal{C}$ if and only if
$\mathcal{C} = \{v \in F^n : Hv^t = 0\}$. Because the dimension of a linear $(n, 2^k, d)$ code is k, the
dimension of its dual code is n - k. Therefore, H is a parity check matrix for
some $(n, 2^k, d)$ linear code if and only if H is an (n - k) x n, (0,1)-matrix of
(Boolean) rank n - k.

6.2.7 Example. The m x n matrix $H_m$ is the prototype parity check matrix. Its
rows are a basis for $\mathcal{C}_m = \mathcal{H}_m^\perp$, i.e., $\mathcal{C}_m$ is the row space of $H_m$. An m x n,
(0,1)-matrix H is a parity check matrix for $\mathcal{H}_m$, if and only if H and $H_m$ have the same
row space, if and only if H and $H_m$ are row equivalent. Indeed, the Hermite normal
form of $H_3$ given in Equation (6.11) is the parity check matrix from which
Equations (6.12a)-(6.12c) came. □

As the key to Algorithm 6.2.5, $u = H_m v^t$ also deserves a name. However, since
row vectors are easier to typeset than column vectors, it is $u^t = vH_m^t$ that will
receive the distinction.

6.2.8 Definition. Let H be a fixed but arbitrary parity check matrix for the linear
$(n, 2^k, d)$ code $\mathcal{C}$. With respect to H, the syndrome of $v \in F^n$ is $vH^t$.

With respect to the (n - k) x n matrix H, the syndrome of $v \in F^n$ is the product
of v (viewed as a 1 x n matrix) and the n x (n - k) matrix $H^t$. In particular, the
syndrome of v is a 1 x (n - k) matrix (viewed as a binary word in $F^{n-k}$).

Are all these transposes really necessary?* After all, $F^m$ and the space of column
vectors $\{v^t : v \in F^m\}$ are isomorphic. It seems as if we could save ourselves a lot
of grief by overlooking the distinction between "$\cong$" (isomorphic) and "=" (equal).
That this approach may be too simplistic is suggested by the fact that $\mathcal{H}_3$ and $F^4$
are isomorphic vector spaces!

One thing we can do is substitute something like s for $u^t$ in $u^t = vH^t$, writing,
e.g., "the syndrome $s = vH^t$."

6.2.9 Example. From Example 6.2.4, with respect to $H_3$ the syndrome of
v = 0101100 is s = 011. The reader may confirm that the syndrome of (the
same) v with respect to the Hermite normal form, $H'$, of $H_3$ (Equation (6.11)) is
$s' = vH'^t = 110 \ne s$. Evidently, as implied by Definition 6.2.8, the syndrome of a
binary word $v \in F^n$ depends not only on the linear code but also on the parity check
matrix used in its back-door definition.

While it may be the binary numeral for 6, s' = 110 is the transpose, not of
column 6, but of column j = 3 of H'. This should not come as a surprise. After
all, using H' in the back-door definition of $\mathcal{H}_3$ doesn't alter the fact that the
codeword $c \in \mathcal{H}_3$ nearest to v = 0101100 is obtained by changing bit j = 3 of v. It is
worth emphasizing that it is only the very special form of $H_m$ that permits the
elegant (base 2 numeral) alternative to having to scan $n = 2^m - 1$ columns in search
of $u = H_m v^t$. □

So much for warming up. It's time to consider a general linear $(n, 2^k, d)$ code $\mathcal{C}$.
Suppose, as usual, that $w \in \mathcal{C}$ is sent down a noisy transmission channel and $v \in F^n$
is received. Then v = w + e, where the 1's in e correspond to the places where v
differs from w. Call e = w + v an error pattern. If we knew the error pattern, we
could recover w = v + e. But, that is asking too much. The best we can hope for is a
fast way to find a binary word b such that c = v + b is a codeword nearest to v.

If v is contained in the sphere of influence of some $c \in \mathcal{C}$, then c is the unique
codeword nearest to v. But, each binary word in $F^n$ belongs to the sphere of
influence of some codeword, if and only if $\mathcal{C}$ is a perfect code.† While no binary word
can ever belong to the sphere of influence of more than one codeword, v can fail to
belong to the sphere of influence of any. In the worst case, there may be several
nearest-neighbor codewords, each the same distance from v. It seems we should
add to the specifications for a nearest-neighbor decoding algorithm some
mechanism for resolving such ambiguities.

A necessary and sufficient condition for v + b = c to be a codeword is that
$Hc^t = 0$ for a fixed but arbitrary parity check matrix H. As in Equation (6.15),
$Hc^t = 0$ if and only if $Hv^t = Hb^t$, if and only if $vH^t = bH^t$, i.e., if and only if v
and b have the same syndrome with respect to H. A necessary and sufficient
condition for $v + b = c \in \mathcal{C}$ to be a nearest codeword to v is that the distance
d(c, v) = wt(c + v) = wt(b) be as small as possible. Thus, v should be decoded
as v + b = c, where b is a binary word of minimum weight among those having
the syndrome $s = vH^t$.

*Some authors deal with the annoying transposes by defining, not H, but $H^t$ to be the parity check matrix.
In this approach a generating matrix for $\mathcal{C}^\perp$ is the transpose of a parity check matrix for $\mathcal{C}$. In particular,
some transposing is inevitable.

†In a perfect world, there might be a perfect code for every purpose. In the real world, if $\mathcal{C}$ is a perfect
r-error-correcting binary code, with more than two codewords and satisfying r > 0, then $\mathcal{C}$ is equivalent either to
a Hamming code or to the [23,4096,7] Golay code $\mathcal{G}_{23}$ found in Exercise 29 (below).

Visualize a code book listing all $2^n$ binary words in $F^n$. Imagine the book
organized into chapters, so that binary words v and b are in the same chapter if and only
if $vH^t = bH^t$, i.e., if and only if v and b have the same syndrome. Because
$vH^t = 0 \in F^{n-k}$ if and only if $v \in \mathcal{C}$, one of the chapters consists of codewords.
If the title of each chapter is the syndrome common to every word in it, then the
chapter of codewords is Chapter 0. Finally, suppose the words in each chapter
are organized into paragraphs, by weight, so that the first paragraph contains all
the words of minimum weight. If $\mathcal{C}$ is a perfect code, then the first paragraph of
every chapter will consist of a single word. For an arbitrary linear code, the first
paragraph of Chapter 0 will contain only $0 \in F^n$. In general, however, some
chapters will begin with paragraphs containing more than one word.

The following decoding strategy is an immediate consequence of having such a 
book. When binary word v is received, compute its syndrome $s = vH^t$. Decode v as
v + b = c, where b is the first word in Chapter s. (From the way in which the code 
book was assembled, b has minimum weight among those words with syndrome s. 
By our previous arguments, this means c is a nearest codeword to v. Note that the 
mechanism for resolving ambiguities is implicit in the arrangement of words that 
make up the first paragraph of Chapter s.) 

This strategy can, in fact, be implemented without the book! All we need is a 
table of contents that lists the titles and first words of each chapter. 

6.2.10 Definition. Let H be a fixed but arbitrary parity check matrix for the
linear code $\mathcal{C}$. A standard decoding array for $\mathcal{C}$ is a table in which each syndrome
s is matched with a minimum-weight binary word whose syndrome, with respect to
H, is s.

Why should the first word in an arbitrarily arranged first paragraph be the right
choice for b? Because every word in the first paragraph is a right choice for b! A
more appropriate question is the extent to which a standard decoding array depends
on the arbitrary parity check matrix H.

6.2.11 Theorem. Suppose H and K are two parity check matrices for the same
linear code $\mathcal{C}$. If binary words v and b have the same syndrome with respect to H,
then they have the same syndrome with respect to K.

As we saw in Example 6.2.9, the syndromes themselves may be different. In an
H-based code book, the binary word v = 0101100 may belong to Chapter 011,
while in a K-based book it belongs to Chapter 110. Theorem 6.2.11 guarantees,
however, that the first paragraph of Chapter 011 in the H-based book contains
precisely the same words as the first paragraph of Chapter 110 in the K-based book.

Proof of Theorem 6.2.11: H and K are parity check matrices for the same linear
code $\mathcal{C}$ if and only if they have the same row space (namely, code $\mathcal{C}^\perp$), if and only
if they are row equivalent, if and only if there is a (Boolean) invertible matrix E
such that K = EH. Thus, $Hv^t = Hb^t$ if and only if $EHv^t = EHb^t$, if and only if
$Kv^t = Kb^t$. ■


A formal algorithm based on the code book decoding strategy might look
something like this.

6.2.12 Algorithm. Let H be a fixed but arbitrary parity check matrix for the
linear code $\mathcal{C}$. Given a standard decoding array based on H, a codeword $c \in \mathcal{C}$
nearest to $v \in F^n$ is obtained as follows:

1. Compute the syndrome $s = vH^t$.

2. Let b be the word corresponding to s in the array.

3. Return v + b = c. □

6.2.13 Example. Suppose S = {11100, 01011, 01110, 11001}. Let $\mathcal{C} = \mathcal{L}(S)$.
To implement Algorithm 6.2.12, we need a parity check matrix H. Because
$c \in \mathcal{C}$ if and only if $cH^t = 0$, it follows that $GH^t = 0$ for any generating matrix G
of $\mathcal{C}$. This identity also follows, of course, from the definition of H as a generating
matrix for $\mathcal{C}^\perp$. Because $\mathcal{C}^\perp = \{w : w^t \in \ker(G)\}$, the rows of H are a (transposed)
basis of the kernel of G.

To find a generating matrix for $\mathcal{C} = \mathcal{L}(S)$, consider matrix

$$A = \begin{pmatrix} 1&1&1&0&0 \\ 0&1&0&1&1 \\ 0&1&1&1&0 \\ 1&1&0&0&1 \end{pmatrix}$$

whose rows are the codewords in S. Because it is the row space of A, a basis for $\mathcal{C}$ is
comprised of the nonzero rows in the Hermite normal form of A. Adding row 1 of A
to row 4, and row 2 to rows 1 and 3, we obtain the row equivalent matrix

$$B = \begin{pmatrix} 1&0&1&1&1 \\ 0&1&0&1&1 \\ 0&0&1&0&1 \\ 0&0&1&0&1 \end{pmatrix}.$$

Adding row 3 of B to rows 1 and 4 produces the Hermite normal form

$$C = \begin{pmatrix} 1&0&0&1&0 \\ 0&1&0&1&1 \\ 0&0&1&0&1 \\ 0&0&0&0&0 \end{pmatrix}.$$

Therefore,

$$G = \begin{pmatrix} 1&0&0&1&0 \\ 0&1&0&1&1 \\ 0&0&1&0&1 \end{pmatrix}$$ (6.17)

is a generating matrix for $\mathcal{C}$. If $w = x_1x_2x_3x_4x_5$, then $w \in \mathcal{C}^\perp$ if and only if
$w^t \in \ker(G)$, if and only if

$x_1 + x_4 = 0,$
$x_2 + x_4 + x_5 = 0,$
$x_3 + x_5 = 0,$

if and only if

$x_1x_2x_3x_4x_5 = x_4\,11010 + x_5\,01101.$ (6.18)

Therefore, B = {11010, 01101} is a basis for $\mathcal{C}^\perp$, and

$$H = \begin{pmatrix} 1&1&0&1&0 \\ 0&1&1&0&1 \end{pmatrix}$$ (6.19)

is a parity check matrix for $\mathcal{C}$.

If $u = Hv^t$ for some $v \in F^5$, then the corresponding syndrome
$s = u^t = vH^t \in F^2$. Evidently, the available syndromes are 00, 01, 10, and 11.
Because $00 = 00000H^t$, and every nonzero binary word in $F^5$ has positive weight,
we see (again) that the only possible pairing for the syndrome s = 00 in a standard
decoding array for $\mathcal{C}$ is b = 00000.

Let $e_j \in F^5$ be the word whose only nonzero bit is the jth, $1 \le j \le 5$. Then
$1 = \mathrm{wt}(e_j) \le \mathrm{wt}(v)$ for every nonzero $v \in F^5$. Because $u_j = He_j^t$ is the jth column
of H, it is easy to see, e.g., that $s = 01 = e_3H^t = e_5H^t \ne bH^t$, for any binary word b
of weight 1 different from $e_3$ and $e_5$. It follows that 00100 and 00001 are the only
possible pairings for syndrome s = 01 in a standard decoding array for $\mathcal{C}$. Which is
correct? Either! These two words comprise the first paragraph of Chapter 01 of the
code book for $\mathcal{C}$ based on H. Pick one of them at random, or pick one using some
arbitrary criterion, e.g., the smaller base 2 numeral.

Similarly, one of 10000 or 00010 must correspond to s = 10. Finally, b = 01000
is the unique word of minimum weight corresponding to syndrome s = 11. Using
the smaller binary word as a tie breaker, we obtain the standard decoding array
exhibited in Fig. 6.2.1.

Syndrome $s = vH^t$    Minimum weight b
00                     00000
01                     00001
10                     00010
11                     01000

Figure 6.2.1. Standard decoding array for $\mathcal{L}(11100, 01011, 01110, 11001)$.

Suppose, e.g., binary word v = 10101 is received over a transmission channel
employing the code $\mathcal{C}$. With respect to the same parity check matrix H just used
in the construction of the standard decoding array, $vH^t = 10$. Because s = 10 is
paired with the binary word b = 00010 in Fig. 6.2.1, v is decoded as
v + b = 10101 + 00010 = 10111. (Confirm that $c = 10111 \in \mathcal{C}$.) □
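The construction in Example 6.2.13 and the three steps of Algorithm 6.2.12 can be sketched in Python as follows. This is an illustrative brute-force version, not an efficient one; the names `syndrome`, `standard_decoding_array`, and `decode` are hypothetical, words are tuples of bits, and ties are broken by the smaller base 2 numeral as in the text.

```python
from itertools import product

def syndrome(v, H):
    """s = v H^t over F = {0, 1}, as a tuple of bits."""
    return tuple(sum(a * b for a, b in zip(v, row)) % 2 for row in H)

def standard_decoding_array(H):
    """Pair each syndrome with a minimum-weight word having that syndrome."""
    n = len(H[0])
    array = {}
    # Visit words by increasing weight, then by increasing numeral, so the first
    # word seen for a syndrome is a minimum-weight word with the smallest numeral.
    for w in sorted(product((0, 1), repeat=n), key=lambda w: (sum(w), w)):
        array.setdefault(syndrome(w, H), w)
    return array

def decode(v, H, array):
    """Algorithm 6.2.12: compute s, look up b, return c = v + b."""
    b = array[syndrome(v, H)]
    return tuple(x ^ y for x, y in zip(v, b))

H = [(1, 1, 0, 1, 0), (0, 1, 1, 0, 1)]  # the parity check matrix of Equation (6.19)
array = standard_decoding_array(H)
c = decode((1, 0, 1, 0, 1), H, array)
```

Enumerating all $2^n$ words is only reasonable for tiny n; the point of the sketch is that the table has just $2^{n-k}$ entries once it has been built.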

The fact that the generating matrix in Equation (6.17) is of the form $G = (I_k | X)$
means that we worked harder than necessary in Example 6.2.13.

6.2.14 Theorem. If $\mathcal{C}$ is an $(n, 2^k, d)$ linear code with a generating matrix of the
form $G = (I_k | X)$, then $H = (X^t | I_{n-k})$ is a parity check matrix for $\mathcal{C}$.

Proof. Because

$$(I_k | X) \begin{pmatrix} X \\ I_{n-k} \end{pmatrix} = I_k X + X I_{n-k} = X + X = 0,$$

the columns of $H^t$ belong to the kernel of G, i.e., the rows of H belong to $\mathcal{C}^\perp$.
Because it is an (n - k) x n matrix of rank n - k, H is a generating matrix
for $\mathcal{C}^\perp$. ■
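Theorem 6.2.14 amounts to a simple rearrangement of matrix entries. A minimal Python sketch follows; the function names `parity_check_from_systematic` and `times_transpose` are hypothetical, and matrices are assumed to be lists of 0/1 rows. It forms $H = (X^t | I_{n-k})$ from $G = (I_k | X)$ and checks the identity $GH^t = 0$ for the matrix of Equation (6.17).

```python
def parity_check_from_systematic(G):
    """Given G = (I_k | X) as a list of rows, return H = (X^t | I_{n-k})."""
    k, n = len(G), len(G[0])
    for i in range(k):
        if any(G[i][j] != (1 if i == j else 0) for j in range(k)):
            raise ValueError("G is not of the form (I_k | X)")
    X = [row[k:] for row in G]
    # Row j of H is column j of X followed by row j of the identity I_{n-k}.
    return [
        [X[i][j] for i in range(k)] + [1 if t == j else 0 for t in range(n - k)]
        for j in range(n - k)
    ]

def times_transpose(G, H):
    """The k x (n-k) Boolean product G H^t."""
    return [[sum(a * b for a, b in zip(g, h)) % 2 for h in H] for g in G]

G = [[1, 0, 0, 1, 0], [0, 1, 0, 1, 1], [0, 0, 1, 0, 1]]  # Equation (6.17)
H = parity_check_from_systematic(G)
product_is_zero = all(entry == 0 for row in times_transpose(G, H) for entry in row)
```

No row reduction or kernel computation is needed, which is the sense in which Example 6.2.13 "worked harder than necessary."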

Note that the matrix H in Equation (6.19) is of the form $(X^t | I_2)$, where
$G = (I_3 | X)$ is the matrix in Equation (6.17).

6.2.15 Definition. A systematic linear code is one that has a generating matrix
of the form $G = (I_k | X)$, where X is a k x (n - k), (0,1)-matrix.

If G is an arbitrary generating matrix of an arbitrary $(n, 2^k, d)$ linear code $\mathcal{C}$, then
$\mathcal{C}$ is a systematic linear code if and only if the (unique) Hermite normal form of G
is $(I_k | X)$. It follows from Theorem 6.2.14 that a parity check matrix for a systematic
linear code is easily obtained from the Hermite normal form (shared by all) of its
generating matrices.

Consider the linear code $\mathcal{C}''$ generated by

$$G'' = \begin{pmatrix} 1&0&1&0&0 \\ 0&1&1&0&1 \\ 0&0&0&1&1 \end{pmatrix}.$$ (6.20)

Because G'' is already in Hermite normal form, $\mathcal{C}''$ is not systematic. However, $\mathcal{C}''$
is "equivalent" to the systematic code of Example 6.2.13.

6.2.16 Definition. Let $\mathcal{C}_1$ and $\mathcal{C}_2$ be two (not necessarily linear) codes. If the
codewords of $\mathcal{C}_2$ can be obtained from the codewords of $\mathcal{C}_1$ by some systematic
permutation of their bits, then $\mathcal{C}_2$ is equivalent to $\mathcal{C}_1$.

Because the generating matrix G'' of Equation (6.20) can be obtained by
switching columns 3 and 4 in the generating matrix G of Equation (6.17), the
corresponding code $\mathcal{C}''$ is equivalent to the code $\mathcal{C}$ of Example 6.2.13. Thus, it should be
possible to modify the table in Fig. 6.2.1 so as to obtain a standard decoding array
for $\mathcal{C}''$. But how?

Switching columns 3 and 4 of G is an elementary column operation. It can be
achieved by multiplying G on the right by a permutation matrix P. If G'' = GP, then
$(GP)(P^{-1}H^t) = GH^t = 0$, i.e., $H''^t = P^{-1}H^t$. Since the inverse of a permutation
matrix is its transpose, H'' = HP. In this case, a parity check matrix for $\mathcal{C}''$ can
be obtained from a parity check matrix for $\mathcal{C}$ simply by switching columns 3
and 4 of H, i.e.,

$$H'' = \begin{pmatrix} 1&1&1&0&0 \\ 0&1&0&1&1 \end{pmatrix}.$$

Of course, finding a parity check matrix is only the first step in producing a standard
decoding array.

6.2.17 Example. This section begain with the construction of Hamming codes 
by means of generating matrices of their dual codes. Let's have a look at ^3 = 
in its own right. By definition, 



is a generating matrix for the (7,8,4) linear code <^ 3 = {0000000,0001111, 
01 1001 1,1010101,0111 100, 101 1010, 1 1001 10, 1 101001 }. From the perspective of 
^3, the matrix G in Equation (6.14) is a generating matrix. From the perspective of 
^3, the same matrix is the parity check matrix 




1 1 0 
1 0 1 





0 0 0 1 1 1 1 
0 110 0 11 
10 10 10 1 




H = 



l\ 1 1 0 0 0 0\ 
10 0 110 0 
0 10 10 10 

V 1 101001/ 



(6.21) 
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0 


1 


2 


3 


4 


5 


6 


7 




0000 


1101 


1011 


1000 


0111 


0100 


0010 


0001 



Figure 6.2.2 



Let's use $H$ to construct a standard decoding array for $\mathcal{C}_3$. Because it has four
rows, the syndromes with respect to $H$ are elements of $F^4$. So, there are $2^4 = 16$
possible syndromes, of which $s_0 = 0000$ is the title of the chapter containing the
codewords.

As in Example 6.2.13, let $e_j \in F^7$ be the binary word of weight 1 whose only
nonzero bit is the $j$th, so that $u_j = He_j^t$ is the $j$th column of $H$. Because the columns
of $H$ are all different, and no nonzero word has weight less than that of $e_j$, we deduce that
$e_j$ is the unique minimum-weight binary word having syndrome $s_j = u_j^t = e_jH^t$. So,
$s_j$ must be paired with $e_j$ in any standard decoding array based on $H$. This takes
care of the eight syndromes listed in Fig. 6.2.2. Moreover, any binary word associated
with a syndrome not listed in Fig. 6.2.2 must have weight not less than 2.

The typical binary word of length 7 and weight 2 is of the form $e_i + e_j$, where
$i \ne j$. Observe that $H(e_i + e_j)^t$ is the sum of columns $i$ and $j$ of $H$. Thus, e.g., the
as-yet unlisted syndrome $0110 = (e_1 + e_2)H^t$, and we may associate 0110 with
$e_1 + e_2 = 1100000$. (To construct a standard decoding array, we don't need to
know every word in the first paragraph of each chapter; it suffices to know one
of them!) Similarly, the transposed sum of columns 1 and 3 of $H$ is 0101, the
syndrome for $e_1 + e_3$. Because 0101 does not appear in Fig. 6.2.2, it is not the
syndrome of any word of weight less than 2. So, we may as well pair 0101 with
1010000 in our growing standard decoding array.

Continuing in this way, it seems natural to pair 1010 with $e_1 + e_4 = 1001000$,
1001 with $e_1 + e_5 = 1000100$, 1111 with $e_1 + e_6 = 1000010$, 1100 with $e_1 + e_7 = 1000001$,
and 0011 with $e_2 + e_3 = 0110000$. The only remaining unmatched syndrome
is 1110. Because it is not the transposed sum of any two columns of $H$, there
are two possibilities. Either 1110 is not the syndrome, with respect to this parity
check matrix, of any binary word (ruled out by Exercise 23, below) or, in the code
book based on $H$, every binary word in Chapter 1110 has weight greater than 2. In
fact, 1110 is the syndrome of $2^3 = 8$ words (one for each codeword), of which $v = 0010110$ is one having
weight 3. Pairing 0010110 with 1110 completes the standard decoding array for
$\mathcal{C}_3 = \mathcal{H}_3^{\perp}$ exhibited in Fig. 6.2.3. □

    Syndrome   Word       Syndrome   Word
    0000       0000000    1000       0010000
    0001       0000001    1001       1000100
    0010       0000010    1010       1001000
    0011       0110000    1011       0100000
    0100       0000100    1100       1000001
    0101       1010000    1101       1000000
    0110       1100000    1110       0010110
    0111       0001000    1111       1000010

Figure 6.2.3. A standard decoding array for $\mathcal{C}_3 = \mathcal{H}_3^{\perp}$.
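The construction in Example 6.2.17 can be mechanized. The following is a minimal Python sketch, not part of the original text, that assumes the parity check matrix $H$ of Equation (6.21) and scans binary words in order of increasing weight, keeping the first word found for each syndrome. The function names (`syndrome`, `standard_decoding_array`, `decode`) are illustrative choices.

```python
from itertools import combinations

# Parity check matrix H of Equation (6.21), one bit string per row.
H = ["1110000", "1001100", "0101010", "1101001"]


def syndrome(word):
    """Return the syndrome word * H^t as a bit string (arithmetic mod 2)."""
    return "".join(
        str(sum(int(w) & int(h) for w, h in zip(word, row)) % 2) for row in H
    )


def standard_decoding_array(n=7):
    """Pair every syndrome with one minimum-weight word that produces it."""
    array = {}
    for weight in range(n + 1):
        for ones in combinations(range(n), weight):
            word = "".join("1" if i in ones else "0" for i in range(n))
            # Words arrive in order of increasing weight, so the first word
            # seen for a syndrome has minimum weight in its chapter.
            array.setdefault(syndrome(word), word)
        if len(array) == 2 ** len(H):
            break
    return array


def decode(word, array):
    """Subtract (add, mod 2) the word paired with the syndrome of `word`."""
    leader = array[syndrome(word)]
    return "".join(str(int(a) ^ int(b)) for a, b in zip(word, leader))


if __name__ == "__main__":
    for s, w in sorted(standard_decoding_array().items()):
        print(s, w)
```

Because several words of weight 3 share the syndrome 1110, the word this sketch pairs with 1110 need not be the 0010110 chosen in Fig. 6.2.3; any minimum-weight choice yields a valid standard decoding array.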

There is nothing particularly fast about constructing a standard decoding array. 
Fortunately, it need be done only once. With a standard decoding array available, 
binary words can be decoded as fast as their syndromes can be identified. 



6.2. EXERCISES 



1 Using Boolean arithmetic, show that 



2 Confirm that the Hamming code of Example 1.4.15 is identical to the Hamming
code of Example 6.2.3.

3 Let $\mathcal{C} = \mathcal{H}_2$.

(a) Compute the $(n, M, d)$ parameters for $\mathcal{C}$.

(b) List (all) the codewords in $\mathcal{C}$.

(c) Exhibit a generating matrix for $\mathcal{C}$.

4 Let $\mathcal{C}_3$ be the linear code generated by $H_3$ (Equation (6.10)), so that $\mathcal{C}_3 = \mathcal{H}_3^{\perp}$.

(a) Show that $\mathcal{C}_3$ is not perfect.

(b) Prove or disprove that $\mathcal{C}_3 \subset \mathcal{C}_3^{\perp}$.

5 Let $\mathcal{C}_m$ be the dual of the Hamming code $\mathcal{H}_m$.

(a) Show that $\mathcal{C}_m$ has a basis in which every codeword has weight $2^{m-1}$.

(b) Does every nonzero codeword of $\mathcal{C}_m$ have weight $2^{m-1}$? (Justify your
answer.)

6 Find a systematic code equivalent to $\mathcal{C} = \mathcal{L}(S)$, when

(a) S = {10101, 10110, 00011}.

(b) S = {11100, 11110, 11111}.

7 Find the (Boolean) Hermite normal form of the matrix



(a) 



/I 
1 
1 

0 

\1 



0 0 1\ 

1 1 0 
1 0 0 

1 1 1 



(b) 



0 0 11/ 



/ 1 1 
1 1 

0 0 

1 1 

\1 1 



0 0 

1 0 

1 1 

0 1 



1 

0 

1 



0 110/ 
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(c) 



\[ \begin{pmatrix} 1&0&1&0&1 \\ 0&1&0&1&0 \\ 1&1&1&0&1 \\ 1&1&1&1&0 \end{pmatrix} \]

(d)

\[ \begin{pmatrix} 0&0&1&1&0&1 \\ 0&1&0&1&0&1 \\ 1&0&0&0&1&0 \\ 1&1&1&1&1&0 \end{pmatrix} \]



8 Exhibit the parameters $(n, 2^k, d)$ for the linear code $\mathcal{C}$ defined to be the row
space of the matrix in the corresponding part of Exercise 7.

9 Exhibit a parity check matrix for the linear code $\mathcal{C}$ defined to be the row space
of the matrix in the corresponding part of Exercise 7.

10 Construct a standard decoding array for the linear code $\mathcal{C}$ defined to be the row
space of the matrix in the corresponding part of Exercise 7.

11 Let $\mathcal{C} = \mathcal{L}(10010, 01011, 00101)$ be the code in Example 6.2.13. Use the
standard decoding array of Fig. 6.2.1 to decode

(a) v = 11001. (b) v = 01010. (c) v = 00110.

12 Let $G = (I_k | X)$ be a generating matrix for the linear code $\mathcal{C}$. Prove that $\mathcal{C}$ is
self-dual (i.e., $\mathcal{C}^{\perp} = \mathcal{C}$) if and only if $XX^t = X^tX = I_k$.

13 Let $\mathcal{C} = \mathcal{H}_3^{\perp}$ be the code in Example 6.2.17. Use the standard decoding array
of Fig. 6.2.3 to find a nearest codeword to

(a) v = 1101111. (b) v = 1001101. (c) v = 0101010.

14 Let $\mathcal{C} = \mathcal{H}_3^{\perp}$ be the code in Example 6.2.17. Use the standard decoding array
of Fig. 6.2.3 to find a nearest codeword to

(a) v = 1000011. (b) v = 0100101. (c) v = 0010110.

(d) v = 1101010. (e) v = 1111111. (f) v = 1110001.

15 Let $G$ be the generating matrix for $\mathcal{H}_3$ given in Equation (6.14).

(a) Show that the Hermite normal form of $G$ is of the form $G' = (I_4 | X)$.

(b) Show that the Hermite normal form of the parity check matrix $H = (X^t | I_3)$
is identical to the matrix $H'$ given in Equation (6.11).

16 Let $H$ be a parity check matrix for a linear $(n, 2^k, d)$ code $\mathcal{C}$. Suppose $v \in \mathcal{C}$
satisfies wt(v) = d.

(a) Prove that the $d$ columns of $H$ corresponding to the positions of the 1's in $v$
are linearly dependent.

(b) If $d > 1$, prove that every selection of $d - 1$ columns of $H$ is linearly
independent.

(c) Prove that $d \le n - k + 1$.



17 Find the codeword $c \in \mathcal{H}_3$ nearest to

(a) v = 1011110. (b) v = 1010110. (c) v = 0110110.

(d) v = 0001111. (e) v = 1110111. (f) v = 1101111.

18 Let $K$ be the 5 × 32 matrix obtained from $H_5$ by adding a new first column
consisting entirely of 0's. Let $G$ be the 6 × 32 matrix obtained from $K$ by
adding a new sixth row consisting entirely of 1's. Let $\mathcal{R}$ be the linear code
generated by $G$. (This is the first-order Reed-Muller code used in the Mariner
missions to Mars.)

(a) Show that $\mathcal{R}$ is a (32, 64, 16) code.

(b) Prove that $\mathcal{R}$ is not a perfect code.

19 Prove the statement in the text that, as vector spaces, $\mathcal{H}_3$ and $F^4$ are
isomorphic.

20 Given that $\mathcal{H}_3$ and $F^4$ are isomorphic as vector spaces, would you say that $\mathcal{H}_3$
and $F^4$ are isomorphic as codes? Explain.

21 Let $\mathcal{C} = \mathcal{L}(10010, 01011, 00101)$ be the $(5, 8, d)$ code from Example 6.2.13.

(a) Find $d$.

(b) List all words $w \in F^5$ that have syndrome $s = 11 \in F^2$ with respect to the
parity check matrix of Equation (6.19).

22 Let $H$ be a fixed but arbitrary $(n - k) \times n$ parity check matrix for the $(n, 2^k, d)$
linear code $\mathcal{C}$. Suppose $s = vH^t$ is the syndrome of $v \in F^n$. Let
$X = \{w \in F^n : s = wH^t\}$ be the set of binary words having the same
syndrome as $v$. Prove that $X = \{v + c : c \in \mathcal{C}\}$.

23 Let $\mathcal{C}$ be an $(n, 2^k, d)$ linear code. Show that any code book for $\mathcal{C}$ must contain
exactly $2^{n-k}$ chapters, so that every element of $F^{n-k}$ is the syndrome of some
binary word $v \in F^n$.

24 Suppose $G$ is a $k \times n$ generating matrix for a linear $(n, 2^k, d)$ code $\mathcal{C}$. Define a
function $T : F^k \to \mathcal{C}$ by $T(v) = vG$. Prove that

(a) $T$ is one-to-one.

(b) $T$ is onto.

(c) $T$ is linear.

(d) $F^k$ and $\mathcal{C}$ are isomorphic as vector spaces.

25 A nonempty set $S \subset F^n$ is orthogonal if $u \cdot v = 0$ for all $u, v \in S$, $u \ne v$.

(a) Show that the rows of $H_2$ are not orthogonal.

(b) Show that the rows of $H_3$ are orthogonal.

(c) Prove or disprove that the rows of $H_m$ are orthogonal for all $m > 3$.

26 Let $G$ be the 4 × 7 matrix obtained from $H_3$ by adding a new fourth row
consisting entirely of 1's. Let $\mathcal{C}$ be the linear code generated by $G$. Prove that
$\mathcal{C} = \mathcal{H}_3$.

27 Let $K$ be the 3 × 8 matrix obtained from $H_3$ by adding a new first column
consisting entirely of 0's. Let $G$ be the 4 × 8 matrix obtained from $K$ by
adding a new fourth row consisting entirely of 1's. Find the parameters of the
linear code $\mathcal{W}$ generated by $G$.

28 The extended Golay code $\mathcal{G}_{24}$ used in the Voyager missions is the linear code
generated by the matrix $G = (I_{12} | X)$, where $X$ is the symmetric 12 × 12 matrix
shown in Fig. 6.2.4.

(a) Show that $(X | I_{12})$ is also a generating matrix for $\mathcal{G}_{24}$.

(b) Show that $(X | I_{12})$ is a parity check matrix for $\mathcal{G}_{24}$.

(c) Prove that $\mathcal{G}_{24}$ is self-dual.

(d) Show that $\mathcal{G}_{24}$ is a (24, 4096, 8) code.

Figure 6.2.4

29 The Golay code $\mathcal{G}_{23}$ is obtained from $\mathcal{G}_{24}$ (Exercise 28) by removing the last
bit from every codeword.

(a) Find the parameters of $\mathcal{G}_{23}$.

(b) Prove that $\mathcal{G}_{23}$ is a perfect code.

(c) Prove or disprove that $\mathcal{G}_{23}$ is linear.



6.3. LATIN SQUARES 

Growing tired of the debates, I was induced to amuse myself with making magic squares. 

— Benjamin Franklin (Autobiography) 

In the following two-person game, players G and B alternately choose numbers
(without replacement) from {1, 2, ..., 9}. The first person to choose three numbers
that sum to 15 is the winner. They need not be the first three numbers, or even some
consecutive three numbers, but there must be three of them. The game is a draw if,
after all nine numbers have been chosen, neither player has three numbers that sum
to 15.

    1  2  3  4  5  6  7  8  9     (2, 6, and 8 struck out)
    G: 2, 8  |  B: 6

Figure 6.3.1

Figure 6.3.1 shows a game in progress. Three numbers have been chosen,
namely, $g_1 = 2$, $b_1 = 6$, and $g_2 = 8$. It is B's turn. The choice $b_2 = 9$ does not result
in a win for B. While 6 + 9 = 15, it is the sum of only two numbers. Since player B
cannot hope to win on his second turn, the best he can do is block player G from
winning by choosing $b_2 = 5$. Now it is G's turn, and she must choose $g_3$ from
{1, 3, 4, 7, 9}. Since B has prevented her from winning on this turn, G's best
strategy is to choose $g_3 = 4$, presenting B with the "board" exhibited in Fig. 6.3.2.
Seeing that either 3 or 9 produces a winning triple for G, while he has no winning move
himself, B resigns.

    1  2  3  4  5  6  7  8  9     (2, 4, 5, 6, and 8 struck out)
    G: 2, 8, 4  |  B: 6, 5

Figure 6.3.2

Is there a strategy that guarantees a win for the first player? Not only does s/he 
have the first opportunity to win (at the third turn), but if the point is reached where 
all nine numbers have been chosen, s/he will have C(5, 3) = 10 triples from which 
to find a winning combination, while the second player will have only C(4, 3) = 4. 

Let's replay the game on the board illustrated in Fig. 6.3.3a. If we circle G's
choices and cross out B's, then player B resigned at the point illustrated in
Fig. 6.3.3b.

Convince yourself that there are exactly eight winning combinations in the 
15-game, namely, {1,5,9}, {1,6,8}, {2,4,9}, {2,5,8}, {2,6,7}, {3,4,8}, 
{3,5,7}, and {4,5,6}. These correspond, via Fig. 6.3.3a, to the eight winning 
combinations in tic-tac-toe. Evidently, the 15-game is isomorphic to a game in 
which no strategy guarantees a win for the first player! 
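The count of winning combinations is easy to confirm by brute force. The sketch below is an illustration added here, not the author's method. Since Fig. 6.3.3a is not reproduced in this text, the 3 × 3 array `SQUARE` is an assumed stand-in: one standard magic square of order 3, which may differ from the figure by a rotation or reflection.

```python
from itertools import combinations

# All 3-element subsets of {1, ..., 9} that sum to 15.
winning = {frozenset(t) for t in combinations(range(1, 10), 3) if sum(t) == 15}

# Assumed stand-in for Fig. 6.3.3a: a magic square of order 3.
SQUARE = [[2, 7, 6],
          [9, 5, 1],
          [4, 3, 8]]


def tic_tac_toe_lines(square):
    """Return the eight winning lines of tic-tac-toe as sets of entries."""
    n = len(square)
    rows = [frozenset(row) for row in square]
    cols = [frozenset(square[i][j] for i in range(n)) for j in range(n)]
    diagonals = [
        frozenset(square[i][i] for i in range(n)),
        frozenset(square[i][n - 1 - i] for i in range(n)),
    ]
    return set(rows + cols + diagonals)


if __name__ == "__main__":
    print(sorted(sorted(t) for t in winning))
    print(winning == tic_tac_toe_lines(SQUARE))
```

The second line of output reports whether the triples summing to 15 coincide with the rows, columns, and diagonals of the square, which is the isomorphism with tic-tac-toe described above.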

6.3.1 Definition. A magic square of order $n$ is an $n \times n$ array in which the
numbers $1, 2, \ldots, n^2$ are arranged so that each row and each column sums to the
same (magic) number.

Figure 6.3.4. Franklin's magic square.

The magic square of order 3 in Fig. 6.3.3a has some extra magic because the two
diagonals also sum to 15. The magic square of order 8 in Fig. 6.3.4 (magic number
260) was discovered by Benjamin Franklin (1706-1790). It, too, has some extra
magic. If it is partitioned into four 4 × 4 blocks, then each of them is a pseudo
magic square. (While the rows and columns of each of these blocks sum to 130,
none of them contains [just] the numbers 1, 2, ..., 16.) When it comes to extra
magic, however, the grand prize goes to Leonhard Euler, whose magic square of
order 8 is simultaneously a knight's tour of the chess board (Fig. 6.3.5).

Figure 6.3.5. Euler's magic square.



For us, the significance of magic squares is that they illustrate an area of 
combinatorics concerned with the interplay between numerical constraints and geo- 
metric arrangements. Our study of more serious examples of this interplay begins 
with Latin squares. 

6.3.2 Definition. Let $V$ be an $n$-element set. A Latin square based on $V$ is an
$n \times n$ matrix, each of whose rows and columns contains every element of $V$. A
Latin square of order $n$ is a Latin square based on some $n$-element set.

6.3.3 Example. Matrices $A = (a_{ij})$ and $B = (b_{ij})$ in Fig. 6.3.6 are Latin squares
of order 4 based on $V = \{0, 1, 2, 3\}$. Taken together, this pair has some magic of its
own. There are $4^2 = 16$ ways to choose two elements from $V$, with replacement,
where order matters. The magic is that for every such ordered pair $(s, t)$, there is a
matrix location $(i, j)$ such that $a_{ij} = s$ and $b_{ij} = t$. The 4 × 4 array comprised of
these ordered pairs, $(a_{ij}, b_{ij})$, is exhibited in Fig. 6.3.7.

\[ A = \begin{pmatrix} 0&1&2&3 \\ 1&0&3&2 \\ 2&3&0&1 \\ 3&2&1&0 \end{pmatrix}, \qquad B = \begin{pmatrix} 0&1&2&3 \\ 2&3&0&1 \\ 3&2&1&0 \\ 1&0&3&2 \end{pmatrix} \]

Figure 6.3.6. Orthogonal Latin squares. (The entries shown are those recorded as the ordered pairs of Fig. 6.3.7.)

Euler used arrays like this to construct magic squares. Convert each ordered pair 
of Fig. 6.3.7 into a two-letter word, obtaining 

\[ C_4 = \begin{pmatrix} 00&11&22&33 \\ 12&03&30&21 \\ 23&32&01&10 \\ 31&20&13&02 \end{pmatrix}. \]

Now, forget that the elements of $C_4$ are words and think of them as numbers. Then,
because each row and column sums to 66, $C_4$ is a pseudo magic square. On the
other hand, if we treat the elements of $C_4$, not as base 10 numerals, but as numerals
in base 4 then, upon converting them to base 10, we obtain

\[ C_{10} = \begin{pmatrix} 0&5&10&15 \\ 6&3&12&9 \\ 11&14&1&4 \\ 13&8&7&2 \end{pmatrix}. \]

Adding 1 to each entry of $C_{10}$ produces the genuine magic square

\[ \begin{pmatrix} 1&6&11&16 \\ 7&4&13&10 \\ 12&15&2&5 \\ 14&9&8&3 \end{pmatrix}. \]

□
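Euler's conversion in Example 6.3.3 amounts to reading each ordered pair $(a_{ij}, b_{ij})$ as a two-digit numeral in base $n$ and adding 1. Here is a short Python sketch of that step, offered as an illustration. The matrices `A` and `B` are read off from the ordered pairs of Fig. 6.3.7, and the function names are assumptions made for this example.

```python
# Latin squares of order 4, read off from the ordered pairs of Fig. 6.3.7.
A = [[0, 1, 2, 3], [1, 0, 3, 2], [2, 3, 0, 1], [3, 2, 1, 0]]
B = [[0, 1, 2, 3], [2, 3, 0, 1], [3, 2, 1, 0], [1, 0, 3, 2]]


def are_orthogonal(a, b):
    """True if the n*n ordered pairs (a[i][j], b[i][j]) are all different."""
    n = len(a)
    pairs = {(a[i][j], b[i][j]) for i in range(n) for j in range(n)}
    return len(pairs) == n * n


def magic_from_pair(a, b):
    """Read (a[i][j], b[i][j]) as a base-n numeral and add 1."""
    n = len(a)
    return [[n * a[i][j] + b[i][j] + 1 for j in range(n)] for i in range(n)]


def line_sums(square):
    """Return the list of all row sums followed by all column sums."""
    n = len(square)
    rows = [sum(row) for row in square]
    cols = [sum(square[i][j] for i in range(n)) for j in range(n)]
    return rows + cols


if __name__ == "__main__":
    M = magic_from_pair(A, B)
    for row in M:
        print(row)
    print(are_orthogonal(A, B), set(line_sums(M)))
```

Orthogonality guarantees that the entries are exactly 1, 2, ..., 16, while the Latin property of each square forces every row and column to have the same sum.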



6.3.4 Definition. Let $A = (a_{ij})$ and $B = (b_{ij})$ be Latin squares of order $n$ based
on the elements of $V$. Then $A$ and $B$ are orthogonal* if, for each ordered pair $(s, t)$ of
elements of $V$, there is a (unique) matrix location $(i, j)$ such that $a_{ij} = s$ and $b_{ij} = t$.

*This use of "orthogonal" has no obvious connection either to perpendicularity or to parity.

    (0,0)  (1,1)  (2,2)  (3,3)
    (1,2)  (0,3)  (3,0)  (2,1)
    (2,3)  (3,2)  (0,1)  (1,0)
    (3,1)  (2,0)  (1,3)  (0,2)

Figure 6.3.7



6.3.5 Example. The Latin squares of order 4 exhibited in Fig. 6.3.6 are
orthogonal. If $V = \{x, y, z\}$, the Latin squares

\[ \begin{pmatrix} x&y&z \\ y&z&x \\ z&x&y \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} x&y&z \\ z&x&y \\ y&z&x \end{pmatrix} \]

are orthogonal. (Confirm it.)

Can you find an orthogonal pair of Latin squares of order 2? (Resolve this
question before proceeding any further.) □

Euler discovered an algorithm for generating an orthogonal pair of Latin squares
of order $n$, provided $n$ does not occur in the arithmetic sequence 2, 6, 10, 14, .... In
1782, defeated in his attempts to find an orthogonal pair of order 6, he conjectured
not only that no such pair exists, but that there does not exist an orthogonal pair of
Latin squares of order $n = 4k + 2$ for any $k \ge 1$.

It wasn't until 1900 that G. Tarry confirmed the $n = 6$ case of Euler's conjecture
using the unrevealing strategy of comparing all possible pairs of Latin squares of
order 6. So, Euler was right about $n = 6$. It turns out, however, that he was wrong
about every number in the sequence beyond 6. In 1960, the combined efforts of
Euler, R. C. Bose, E. T. Parker, and S. S. Shrikhande established the following.

6.3.6 Theorem. For every $n$, except $n = 2$ and $n = 6$, there exists an orthogonal
pair of Latin squares of order $n$.

Might there be more than two? What about three mutually orthogonal Latin
squares of order 5, say?

6.3.7 Theorem. There exist at most $n - 1$ mutually orthogonal Latin squares of
order $n$.

Proof. Let $A_1, A_2, \ldots, A_k$ be a family of mutually orthogonal Latin squares based
on $V = \{1, 2, \ldots, n\}$. Suppose the first row of $A_1$ is $x_1, x_2, \ldots, x_n$. Because $A_1$ is a
Latin square, $x_r$ occurs once in each of its rows and columns, $1 \le r \le n$. Construct
an $n \times n$ matrix $B_1$, the $(i, j)$-entry of which is equal to $r$ if and only if the $(i, j)$-entry
of $A_1$ is equal to $x_r$, $1 \le r \le n$. Then $B_1$ is a Latin square whose first row is
$1, 2, \ldots, n$. More remarkable is the fact that the family $B_1, A_2, A_3, \ldots, A_k$ is
mutually orthogonal! To see why, suppose $m \in \{2, 3, \ldots, k\}$. Let $B_1 = (b_{ij})$ and
$A_m = (a_{ij})$. If $(a_{ij}, b_{ij}) = (s, t) = (a_{pq}, b_{pq})$ for two different locations $(i, j)$ and $(p, q)$,
then $b_{ij} = t = b_{pq}$. So, $x_t$ is both the
$(i, j)$-entry and the $(p, q)$-entry of $A_1$. But then $(a_{ij}, x_t) = (a_{pq}, x_t)$, contradicting
the orthogonality of $A_m$ and $A_1$.

Suppose the first row of $A_2$ is $y_1, y_2, \ldots, y_n$. Let $B_2$ be the matrix whose $(i, j)$-entry
is equal to $r$ if and only if the corresponding entry of $A_2$ is $y_r$, $1 \le r \le n$.
Then, $B_2$ is a Latin square whose first row is $1, 2, \ldots, n$ and, by the same argument,
$B_1, B_2, A_3, A_4, \ldots, A_k$ is a family of mutually orthogonal Latin squares. Continuing
in this way, we eventually obtain a family $B_1, B_2, \ldots, B_k$ of orthogonal Latin
squares each of which has the same first row, namely, $1, 2, \ldots, n$.

Denote the (2, 1)-entry of $B_r$ by $z_r$, $1 \le r \le k$. Note that these $z$'s are all different.
If, for example, $z_1$ and $z_2$ were both equal to $t$, then $t$ would be a common entry
of $B_1$ and $B_2$ in positions $(1, t)$ and $(2, 1)$, contradicting the orthogonality of $B_1$ and
$B_2$. Moreover, if $z_r = 1$ for some $r$, then $B_r$ would have two 1's in its first column.
Hence, there are at most $n - 1$ possible $z$'s, i.e., $k \le n - 1$. ■

A family of $n - 1$ mutually orthogonal Latin squares of order $n$ is said to be
complete. It follows from Example 6.3.5 that there exists a complete family of mutually
orthogonal Latin squares of order $n = 3$. However, from Tarry's computations, there
are not even two, much less five, mutually orthogonal Latin squares of order 6. For
the purposes of the next result, it is convenient to stipulate that a single Latin square
constitutes a mutually orthogonal family.

6.3.8 Theorem. For every prime $p$, there exists a (complete) family of $p - 1$
mutually orthogonal Latin squares of order $p$.

Proof. Define a family $A_1, A_2, \ldots, A_{p-1}$ of $p \times p$ matrices as follows: The
$(i, j)$-entry of $A_t$ is the remainder when $ti + j$ is divided by $p$. Evidently, the entries
of $A_t$ come from the set $V = \{0, 1, \ldots, p - 1\}$.

Suppose $ti_1 + j = pq_1 + r_1$ and $ti_2 + j = pq_2 + r_2$, where $0 \le r_1, r_2 < p$. Then $r_1$
is the $(i_1, j)$-entry of $A_t$, and $r_2$ is its $(i_2, j)$-entry. If $r_1 = r_2$, then $t(i_1 - i_2) =
p(q_1 - q_2)$, which implies that $p \mid t$ (i.e., $p$ exactly divides $t$) or $p \mid (i_1 - i_2)$. Neither
alternative is possible when $i_1 \ne i_2$ because both $t$ and $|i_1 - i_2|$ are positive and less than $p$. So, the entries in
column $j$ of $A_t$ are all different. A similar argument for row $i$ of $A_t$ proves that $A_t$ is a
Latin square, $1 \le t \le p - 1$.

To prove orthogonality, suppose $x$ occurs in both the $(i_1, j_1)$ and the $(i_2, j_2)$
positions of $A_t$, and $y$ occurs in both the $(i_1, j_1)$ and the $(i_2, j_2)$ positions of $A_s$, where $s \ne t$. That is,
suppose

\[
\begin{aligned}
ti_1 + j_1 &= pq_1 + x, \\
ti_2 + j_2 &= pq_2 + x, \\
si_1 + j_1 &= pq_3 + y, \\
si_2 + j_2 &= pq_4 + y.
\end{aligned}
\]

Then

\[ t(i_1 - i_2) + (j_1 - j_2) = p(q_1 - q_2) \]

and

\[ s(i_1 - i_2) + (j_1 - j_2) = p(q_3 - q_4), \]

from which it follows that $(t - s)(i_1 - i_2)$ is a multiple of $p$. If $i_1 \ne i_2$, this contradicts the fact
that both $|t - s|$ and $|i_1 - i_2|$ are positive and less than $p$. (If $i_1 = i_2$, the first of the two
displayed differences forces $j_1 = j_2$, so the two positions coincide.) ■
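The construction in the proof of Theorem 6.3.8 is short enough to test directly. The following Python sketch is illustrative: the function names are made up for this example, and rows and columns are indexed from 0, an assumption the proof does not depend on. It builds the matrices $A_t$ and checks the Latin and orthogonality properties by exhaustion.

```python
from itertools import combinations


def latin_family(p):
    """Return A_1, ..., A_{p-1}, where the (i, j)-entry of A_t is (t*i + j) mod p."""
    return [
        [[(t * i + j) % p for j in range(p)] for i in range(p)]
        for t in range(1, p)
    ]


def is_latin(square):
    """True if every row and every column contains each of 0, ..., n-1."""
    n = len(square)
    symbols = set(range(n))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[i][j] for i in range(n)} == symbols for j in range(n))
    return rows_ok and cols_ok


def are_orthogonal(a, b):
    """True if the n*n ordered pairs (a[i][j], b[i][j]) are all different."""
    n = len(a)
    return len({(a[i][j], b[i][j]) for i in range(n) for j in range(n)}) == n * n


def is_complete_family(family):
    """True if all squares are Latin and every two of them are orthogonal."""
    return all(is_latin(a) for a in family) and all(
        are_orthogonal(a, b) for a, b in combinations(family, 2)
    )


if __name__ == "__main__":
    for p in (3, 5, 7):
        print(p, is_complete_family(latin_family(p)))
    # For a composite modulus the same recipe can fail: with modulus 4 and
    # t = 2, rows 0 and 2 of the matrix are identical.
    print(4, is_latin(latin_family(4)[1]))
```

The last line shows why primality matters: the step "$p \mid t$ or $p \mid (i_1 - i_2)$" in the proof has no analogue for a composite modulus.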

Using the theory of finite fields, one can extend the proof of Theorem 6.3.8 and 
obtain the following stronger result. 

6.3.9 Theorem. Suppose $p$ is a prime and $a$ is a positive integer. If $n = p^a$, there
exists a (complete) family of $n - 1$ mutually orthogonal Latin squares of order $n$.

It follows from Theorem 6.3.9 that, apart from 6, there are complete families of
mutually orthogonal Latin squares for $2 \le n \le 9$. The story for $n = 10$ takes us to
the theory of finite projective planes, a topic that has no obvious connection to Latin
squares.

6.3.10 Definition. A projective plane consists of three things, a set of points, a 
set of lines, and an incidence relation, that satisfy the following axioms. 

1. For any pair of distinct points P and Q, there is a unique line L such that P 
and Q are both incident with L. 

2. For any pair of distinct lines L and M, there is a unique point P such that L 
and M are both incident with P. 

3. There exist four distinct points, no three of which are incident with the same line. 

Suppose A, B, C, and D are four different points, no three of which are collinear 
(incident with the same line). From Axiom 1, there is a unique line determined by A 
and B; let's call it AB. 

We claim that no three of the lines AB, BC, CD, and AD are concurrent (incident 
with the same point). Suppose, e.g., there were some point P incident with AB,BC, 
and CD. If P = A, then A, B, and C are all incident with line BC, contradicting the 
hypothesis. If P = B, then B, C, and D are all incident with CD, contradicting 
the hypothesis. If $A \ne P \ne B$ then, by the uniqueness part of Axiom 1, the line
$AB = BP = BC$ is incident with A, B, and C, contradicting the hypothesis. So,
AB, BC, and CD are not concurrent. Similar arguments work for the other three 
ways to select three lines from AB, BC, CD, and AD, proving the following. 

6.3.11 Theorem. There exist four distinct lines, no three of which are incident 
with the same point. 

It follows from Definition 6.3.10 and Theorem 6.3.11 that every theorem in the
theory of projective planes has a "dual" in which the roles of points and lines are
interchanged. This duality principle is of fundamental importance in the theory of
projective planes.

6.3.12 Theorem. Let $P$ and $Q$ be points, and $L$ and $M$ be lines in a projective
plane. Then there is a one-to-one correspondence

(a) between the points incident with $L$ and the points incident with $M$.

(b) between the lines incident with $P$ and the lines incident with $Q$.

(c) between the points incident with $L$ and the lines incident with $P$.

Proof. The existence of a point $O$ incident with neither $L$ nor $M$ is left to the
exercises. For each point $X$ incident with $L$, the distinct lines $OX$ and $M$ are incident
with a unique point $Y$. This sets up a natural mapping $f$ from the points of $L$ to
the points of $M$, namely $f(X) = Y$. If $f(X_1) = Y = f(X_2)$, then $X_1$ and $X_2$ are both
incident with the line $OY$. If $X_1 \ne X_2$ then, by the uniqueness part of Axiom 1,
$L = X_1X_2 = OY$, contradicting the fact that $O$ is not incident with $L$. This proves
that $f$ is one-to-one. If $Y$ is incident with $M$, then $Y \ne O$. If $X$ is the unique point
incident with both $L$ and $OY$, then $f(X) = Y$, proving that $f$ is onto. This completes
the proof of part (a).

Part (b) follows from part (a) by the duality principle. 

If K is a fixed but arbitrary line incident with O, there is a unique point X inci- 
dent with both K and L. So, the function g, from the lines incident with O to the 
points incident with L, defined by g(K) = X, is one-to-one. Because line OX is inci- 
dent with O for every point X of L, g is onto. Together with parts (a) and (b), this 
completes the proof of part (c). ■ 

6.3.13 Definition. A projective plane is finite if its set of points is finite. A finite 
projective plane has order n if there are exactly n + 1 points incident with every 
line. 

6.3.14 Corollary. A finite projective plane of order $n$ has exactly $n^2 + n + 1$
points and $n^2 + n + 1$ lines.

Proof. Let $P$ be a point of a finite projective plane of order $n$. By Theorem
6.3.12(c) and Definition 6.3.13, there are exactly $n + 1$ lines incident with $P$. Apart
from $P$, each of these lines is incident with exactly $n$ other points. Since every point
is incident with one of these $n + 1$ lines, the plane contains exactly $n(n + 1)$ points
different from $P$, i.e., the total number of points is $n^2 + n + 1$. The corresponding
enumeration of lines follows from the duality principle. ■

6.3.15 Example. Together with Axiom 3 of Definition 6.3.10, Corollary 6.3.14 
precludes the existence of a finite projective plane of order 1. By itself, Corollary 
6.3.14 requires that a finite projective plane of order 2 have a total of seven points. 

Let $\{1, 2, \ldots, 7\}$ be the set of points and $\{L_1, L_2, \ldots, L_7\}$ the set of lines, where
$L_1 = \{1, 2, 3\}$, $L_2 = \{1, 4, 7\}$, $L_3 = \{1, 5, 6\}$, $L_4 = \{2, 4, 6\}$, $L_5 = \{2, 5, 7\}$,
$L_6 = \{3, 4, 5\}$, $L_7 = \{3, 6, 7\}$, and "$P$ is incident with $L$" is interpreted to mean
that $P \in L$. Perhaps the easiest way to confirm that Axioms 1-3 are valid for this
model is by means of Fig. 6.3.8 (in which six of the lines are represented by
segments and $L_4$ is represented by a circle). □
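Without Fig. 6.3.8 at hand, the axioms can also be confirmed by exhaustive checking. The following Python sketch is an illustration with made-up function names; it tests Axioms 1-3 of Definition 6.3.10 for the seven lines listed in Example 6.3.15.

```python
from itertools import combinations

POINTS = range(1, 8)
LINES = [{1, 2, 3}, {1, 4, 7}, {1, 5, 6}, {2, 4, 6},
         {2, 5, 7}, {3, 4, 5}, {3, 6, 7}]


def axiom1(points, lines):
    """Every pair of distinct points lies on exactly one line."""
    return all(
        sum(1 for line in lines if {p, q} <= line) == 1
        for p, q in combinations(points, 2)
    )


def axiom2(lines):
    """Every pair of distinct lines meets in exactly one point."""
    return all(len(l & m) == 1 for l, m in combinations(lines, 2))


def axiom3(points, lines):
    """Some four points have no three incident with the same line."""
    def collinear(triple):
        return any(set(triple) <= line for line in lines)
    return any(
        not any(collinear(t) for t in combinations(quad, 3))
        for quad in combinations(points, 4)
    )


if __name__ == "__main__":
    print(axiom1(POINTS, LINES), axiom2(LINES), axiom3(POINTS, LINES))
```

The same three functions apply unchanged to any finite model given as a list of point sets, so they can be reused to check larger planes.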

We come, at last, to the connection between finite projective planes and ortho- 
gonal Latin squares. 

6.3.16 Theorem. Suppose $n \ge 2$. Then there exists a finite projective plane of
order $n$ if and only if there exists a (complete) family of $n - 1$ mutually orthogonal
Latin squares of order $n$.

Proof. Let $L$ be a fixed but arbitrary line in a finite projective plane of order $n$. Let
$P_1, P_2, \ldots, P_{n+1}$ be the points that are incident with $L$ and $Q_1, Q_2, \ldots, Q_{n^2}$ the
points that are not. Apart from $L$, there are exactly $n$ distinct lines that are incident
with $P_i$, call them $M_{i1}, M_{i2}, \ldots, M_{in}$. The proof involves an $(n + 1) \times n^2$ matrix $C$
whose rows are indexed by $P_1, P_2, \ldots, P_{n+1}$ and whose columns are indexed by
$Q_1, Q_2, \ldots, Q_{n^2}$. The $(P_i, Q_j)$-entry of $C$ is $c_{ij} = t$, where $t$ is uniquely determined
by the identity $P_iQ_j = M_{it}$.

Consider, e.g., the model of the finite projective plane of order $n = 2$ discussed
in Example 6.3.15 (and Fig. 6.3.8). Then $n + 1 = 3$ and $n^2 = 4$. If $L$ is the line
$L_1 = \{1, 2, 3\}$, then the points incident with $L$ are $P_i = i$, $1 \le i \le 3$, and the points
not incident with $L$ are $Q_j = j + 3$, $1 \le j \le 4$. Let's find the $(P_2, Q_3)$-entry of the
3 × 4 matrix $C$ corresponding to this scenario.

Apart from $L$, the $n = 2$ lines incident with $P_2 = 2$ are $L_4 = \{2, 4, 6\}$ and
$L_5 = \{2, 5, 7\}$. Let $M_{21} = L_4$ and $M_{22} = L_5$ (an arbitrary choice). From
Example 6.3.15, the unique line determined by $P_2 = 2$ and $Q_3 = 6$ is
$\{2, 4, 6\} = L_4 = M_{21}$, i.e., $P_2Q_3 = M_{21}$. Together with the definition of $C$, this
yields $c_{23} = 1$.

The next step in the proof is to establish the following orthogonality property for
the rows of this awkward matrix $C$.

Property O. If $1 \le i < k \le n + 1$, then $S = \{(c_{ij}, c_{kj}) : 1 \le j \le n^2\}$ is the set of
all $n^2$ ordered selections, with replacement, of two elements from $\{1, 2, \ldots, n\}$.

To confirm Property O, suppose $(c_{ir}, c_{kr}) = (c_{is}, c_{ks})$. If the common value of $c_{ir}$
and $c_{is}$ is $t$ then, from the definition of $C$, $P_iQ_r = M_{it} = P_iQ_s$. In particular,
$P_iQ_r = P_iQ_s$. Similarly, $P_kQ_r = P_kQ_s$. If $r \ne s$, this implies that $P_i$ and $P_k$ are
both incident with line $Q_rQ_s$, i.e., $Q_rQ_s = P_iP_k = L$, contradicting the fact that
neither $Q_r$ nor $Q_s$ is incident with $L$. So the $n^2$ pairs in $S$ are all different, and $S$ is
the set of all $n^2$ ordered selections.

Note that permuting the columns of $C$ is equivalent to renaming the points not
incident with $L$. The effect on the set $S$ is to rearrange its elements, leaving $S$ itself
unchanged. Thus, rearranging the columns of $C$ has no effect on Property O.
Indeed, one consequence of Property O is that the columns of $C$ can be rearranged
to obtain a matrix $B$ the first two rows of which are

\[ (1, 1, \ldots, 1, 2, 2, \ldots, 2, 3, 3, \ldots, 3, \ldots, n, n, \ldots, n), \qquad (6.22) \]

and

\[ (1, 2, \ldots, n, 1, 2, \ldots, n, 1, 2, \ldots, n, \ldots, 1, 2, \ldots, n). \qquad (6.23) \]

Now, for each $r = 1, 2, \ldots, n - 1$, form the $n \times n$ matrix $A_r$ as follows: The first
row of $A_r$ consists of the first $n$ entries in row $r + 2$ of $B$. The second row of $A_r$
consists of the entries in columns $n + 1$ through $2n$ from row $r + 2$ of $B$, and so
on. In general, the $(i, j)$-entry of $A_r$ is the entry in row $r + 2$ and column
$(i - 1)n + j$ of $B$.

Applying Property O to rows 1 and $r + 2$ of $B$ yields that the entries in row $i$ of
$A_r$ are all different, $1 \le i \le n$. (See Expression (6.22).) Applying Property O to
rows 2 and $r + 2$ of $B$ yields that the entries in column $j$ of $A_r$ are all different,
$1 \le j \le n$. (See Expression (6.23).) Therefore, $A_r$ is a Latin square of order $n$
based on $\{1, 2, \ldots, n\}$, $1 \le r < n$. Finally, Property O guarantees that $A_r$ and $A_s$
are orthogonal whenever $r \ne s$.

The converse is proved by reversing these steps. Given $n - 1$ mutually orthogonal
Latin squares of order $n$, form an $(n + 1) \times n^2$ matrix $B$ whose first two rows are
given by Expressions (6.22) and (6.23), respectively, and whose $(r + 2)$nd row
comes from the rows of $A_r$ laid down one after another. For this part of the proof,
no rearrangement of columns is necessary. One can (re)construct from matrix
$C = B$ a finite projective plane of order $n$. The details are omitted, but see Example
6.3.18 (below). ■

6.3.17 Example. In the midst of the proof of Theorem 6.3.16, we evaluated $c_{23}$
with respect to the choices $L = L_1 = \{1, 2, 3\}$, $P_i = i$, $1 \le i \le 3$, $Q_j = j + 3$,
$1 \le j \le 4$, $M_{21} = L_4 = \{2, 4, 6\} = \{P_2, Q_1, Q_3\}$, and $M_{22} = L_5 = \{2, 5, 7\} =
\{P_2, Q_2, Q_4\}$ from the model of the finite projective plane of order $n = 2$ in
Example 6.3.15. With the (arbitrary) choices $M_{11} = L_2 = \{P_1, Q_1, Q_4\}$, $M_{12} = L_3 =
\{P_1, Q_2, Q_3\}$, $M_{31} = L_6 = \{P_3, Q_1, Q_2\}$, and $M_{32} = L_7 = \{P_3, Q_3, Q_4\}$, the entire
matrix is

\[ C = \begin{pmatrix} 1&2&2&1 \\ 1&2&1&2 \\ 1&1&2&2 \end{pmatrix}. \]

Observe that the rows of $C$ are, indeed, mutually orthogonal in the sense that
$S = \{(c_{ij}, c_{kj}) : 1 \le j \le 4\}$ is the set of all four ordered selections, with
replacement, of two elements from $\{1, 2\}$, $1 \le i < k \le 3$. The matrix obtained from $C$
by interchanging its second and fourth columns is

\[ B = \begin{pmatrix} 1&1&2&2 \\ 1&2&1&2 \\ 1&2&2&1 \end{pmatrix}, \]

the first two rows of which have the form prescribed by Expressions (6.22) and
(6.23), respectively. Finally, the Latin square emerging from the third row of $B$ is

\[ A_1 = \begin{pmatrix} 1&2 \\ 2&1 \end{pmatrix}. \]

□

6.3.18 Example. Let's use the mutually orthogonal Latin squares of order 3
from Example 6.3.5 to construct a finite projective plane of order $n = 3$. Replacing
$x$, $y$, and $z$ with 1, 2, and 3, respectively, yields

\[ A_1 = \begin{pmatrix} 1&2&3 \\ 2&3&1 \\ 3&1&2 \end{pmatrix} \quad \text{and} \quad A_2 = \begin{pmatrix} 1&2&3 \\ 3&1&2 \\ 2&3&1 \end{pmatrix}. \]

Laid out end to end, the rows of $A_1$ and $A_2$ generate rows 3 and 4, respectively, of

           Q_1  Q_2  Q_3  Q_4  Q_5  Q_6  Q_7  Q_8  Q_9
    P_1     1    1    1    2    2    2    3    3    3
    P_2     1    2    3    1    2    3    1    2    3
    P_3     1    2    3    2    3    1    3    1    2
    P_4     1    2    3    3    1    2    2    3    1

the matrix $B$, the rows of which are indexed by $P_i$, $1 \le i \le n + 1 = 4$, and the columns by $Q_j$,
$1 \le j \le n^2 = 9$. Together with the orthogonality of the pair $A_1, A_2$, the first two
rows guarantee that $B$ satisfies Property O.

The idea behind the proof of Theorem 6.3.16 is that $\{P_1, P_2, P_3, P_4\} \cup
\{Q_1, Q_2, \ldots, Q_9\}$ comprises the $3^2 + 3 + 1 = 13$ points of a projective plane of
order 3. Apart from $L = \{P_1, P_2, P_3, P_4\}$, the remaining 12 lines of this plane
can be read off from matrix $C = B$:

$M_{11} = \{P_1, Q_1, Q_2, Q_3\}$, $M_{12} = \{P_1, Q_4, Q_5, Q_6\}$, $M_{13} = \{P_1, Q_7, Q_8, Q_9\}$,
$M_{21} = \{P_2, Q_1, Q_4, Q_7\}$, $M_{22} = \{P_2, Q_2, Q_5, Q_8\}$, $M_{23} = \{P_2, Q_3, Q_6, Q_9\}$,
$M_{31} = \{P_3, Q_1, Q_6, Q_8\}$, $M_{32} = \{P_3, Q_2, Q_4, Q_9\}$, $M_{33} = \{P_3, Q_3, Q_5, Q_7\}$,
$M_{41} = \{P_4, Q_1, Q_5, Q_9\}$, $M_{42} = \{P_4, Q_2, Q_6, Q_7\}$, $M_{43} = \{P_4, Q_3, Q_4, Q_8\}$,

where $P_iQ_j = M_{it}$ if and only if $c_{ij} = t$. Observe that each of these lines is incident
with (contains) $n + 1 = 4$ points. Confirm that each point is incident with (contained
in) exactly 4 lines.

To prove that this configuration satisfies Axioms 1-3 of Definition 6.3.10,
observe that the unique line incident with two of the $P$'s is $L$. The unique line incident
with $P_i$ and $Q_j$ is $M_{it}$, where $t = c_{ij}$. The unique line incident with $Q_r$ and $Q_s$ is
$M_{it}$, where $i$ and $t$ are uniquely determined by $c_{ir} = t = c_{is}$. (Property O implies that
$c_{ir} = c_{is}$ and $c_{kr} = c_{ks}$ cannot both hold unless $i = k$.)

The unique point incident with $L$ and $M_{it}$ is $P_i$. The unique point incident with
$M_{it}$ and $M_{kj}$ is $P_i$ if $k = i$; otherwise, it is $Q_r$, where $r$ is the column of $C$ determined
by $c_{ir} = t$ and $c_{kr} = j$. Finally, no three of the points $P_1, P_2, Q_3$, and $Q_4$ are incident
with the same line. □
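The bookkeeping in Example 6.3.18 can be replayed mechanically. The sketch below is an illustration under stated assumptions: points are represented by the strings "P1", ..., "P4" and "Q1", ..., "Q9", and the function names are invented for this example. It builds the matrix $B$ from $A_1$ and $A_2$, reads off the 13 lines, and verifies Axioms 1 and 2 together with the incidence counts.

```python
from itertools import combinations

A1 = [[1, 2, 3], [2, 3, 1], [3, 1, 2]]
A2 = [[1, 2, 3], [3, 1, 2], [2, 3, 1]]


def build_plane(squares):
    """Build the points and lines of a plane from n - 1 orthogonal Latin squares."""
    n = len(squares[0])
    B = [
        [j // n + 1 for j in range(n * n)],  # Expression (6.22)
        [j % n + 1 for j in range(n * n)],   # Expression (6.23)
    ] + [[x for row in a for x in row] for a in squares]
    P = [f"P{i}" for i in range(1, n + 2)]
    Q = [f"Q{j}" for j in range(1, n * n + 1)]
    lines = [set(P)]  # the line L
    for i in range(n + 1):
        for t in range(1, n + 1):
            # M_it contains P_i and every Q_j with c_ij = t.
            lines.append({P[i]} | {Q[j] for j in range(n * n) if B[i][j] == t})
    return P + Q, lines


def check_plane(points, lines):
    """Check Axioms 1 and 2 of Definition 6.3.10 by exhaustion."""
    one_line = all(
        sum(1 for line in lines if {p, q} <= line) == 1
        for p, q in combinations(points, 2)
    )
    one_point = all(len(l & m) == 1 for l, m in combinations(lines, 2))
    return one_line and one_point


if __name__ == "__main__":
    points, lines = build_plane([A1, A2])
    print(len(points), len(lines), check_plane(points, lines))
    print(all(sum(1 for l in lines if p in l) == 4 for p in points))
```

The final line carries out the "Confirm that each point is incident with exactly 4 lines" step requested in the example.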

R. H. Bruck and H. J. Ryser independently discovered a necessary condition for
the existence of a projective plane of order n. If d is the largest (perfect) square
factor of n, then n/d is the square-free part of n.

6.3.19 Bruck-Ryser Theorem. Suppose n is of the form 4k + 1 or 4k + 2. If
the square-free part of n contains a prime factor of the form 4k + 3, then there
does not exist a finite projective plane of order n.

6.3.20 Example. The square-free integer 6 = 4(1) + 2 contains a prime factor 
3 = 4(0) + 3. So (as we already know from other considerations), there is no finite 
projective plane of order 6. While 10 = 4(2) + 2 is also square-free, neither of its 
prime factors is of the form 4k + 3. So, the Bruck-Ryser theorem is silent on planes
of order 10, a topic to be continued. □
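The hypothesis of Theorem 6.3.19 can be tested mechanically. The following Python sketch is only an illustration; the function names are invented here and do not come from the text. It computes the square-free part of n by trial division and reports whether the theorem excludes a plane of order n.

```python
def square_free_part(n):
    """Return n/d, where d is the largest perfect-square factor of n."""
    part, p = 1, 2
    while p * p <= n:
        exponent = 0
        while n % p == 0:
            n //= p
            exponent += 1
        if exponent % 2 == 1:      # p survives in the square-free part
            part *= p
        p += 1
    return part * n                # a leftover n > 1 is a prime to the first power


def prime_factors(n):
    """Return the distinct prime factors of n in increasing order."""
    factors, p = [], 2
    while p * p <= n:
        if n % p == 0:
            factors.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors


def bruck_ryser_excludes(n):
    """True if Theorem 6.3.19 rules out a projective plane of order n."""
    if n % 4 not in (1, 2):
        return False               # the theorem says nothing about such n
    return any(p % 4 == 3 for p in prime_factors(square_free_part(n)))


for order in (6, 9, 10):
    print(order, bruck_ryser_excludes(order))
```

A return value of False means only that the theorem is silent, as it is for n = 10; it does not assert that a plane exists.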



6.3. EXERCISES 

1 Let m be the magic number for a magic square of order n. Find a formula that 
expresses m as a function of n. (Conclude that any two magic squares of the 
same order have the same magic number.) 
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2 Prove that there is no magic square of order 2. 

3 Using the orthogonal Latin squares in Example 6.3.5, mimic the approach used 
in Example 6.3.3 to construct a magic square of order 3. 

4 The 52 cards in a standard bridge deck come in four suits (clubs, diamonds, 
hearts, and spades) each headed by four honors (jack, queen, king, and 
ace). 

(a) Show that the 16 honor cards can be arranged in a 4 x 4 array in such a 
way that every row and every column contains cards representing all four 
suits and all four honors. 

(b) Explain how the arrangement in part (a) can be viewed as a model for two 
orthogonal Latin squares of order 4. 

5 Exhibit a family of three mutually orthogonal Latin squares of order 4 each of 
which has the same first row. 

6 Let $A = (a_{ij})$ be an n × n matrix. A (generalized) diagonal of A is a sequence
$(a_{1p(1)}, a_{2p(2)}, \ldots, a_{np(n)})$, where $p \in S_n$. If A is a Latin square on V, a
transversal of A is a diagonal that contains every element of V. If $B = (b_{ij})$
is another Latin square based on V, show that A and B are orthogonal if and
only if, for all $x \in V$, the elements of $\{a_{ij} : b_{ij} = x\}$ are the terms of a
transversal of A.

7 Prove that a Cayley table for a (finite) permutation group G is a Latin square 
based on V = G. 

8 Construct a magic square of order 6. (This is not an easy exercise.) 

9 Prove that magic squares of order n exist for every $n \ne 2$.

10 A Latin square is self-orthogonal if it is orthogonal to its transpose.

(a) Prove that there is no self-orthogonal Latin square of order 3. 

(b) Exhibit a self-orthogonal Latin square of order 4. 

11 Say that two Latin squares are equivalent if it is possible to obtain the second 
by permuting the rows and columns of the first. Exhibit two inequivalent Latin 
squares of order 4. 

12 If a finite projective plane has 183 points, how many lines are incident with 
each one of them? 

13 Explain why the Bruck-Ryser theorem does not supersede Tarry's theorem. 

14 Use the Bruck-Ryser theorem to prove the nonexistence of a finite projective 
plane of order 

(a) 14. (b) 21. (c) 22. 

15 Construct a family of four mutually orthogonal Latin squares of order 5. 
16 Let $C = (c_{ij})$ and $R = (r_{ij})$ be n × n matrices defined by $c_{ij} = i$, $1 \le j \le n$, and
$r_{ij} = j$, $1 \le i \le n$.

(a) Show that C and R are orthogonal, i.e., for each ordered pair (s, t),
$1 \le s, t \le n$, there is a (unique) matrix location (i, j) such that $c_{ij} = s$ and
$r_{ij} = t$.

(b) Show that A is a Latin square based on {1,2, ... ,n} if and only if A is 
orthogonal to both C and R. 

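17 If $A = (a_{ij})$ and $B = (b_{ij})$ are m × m and n × n matrices, respectively, their
Kronecker product is the mn × mn block partitioned matrix

$$A \otimes B = \begin{pmatrix} a_{11}B & a_{12}B & \cdots & a_{1m}B \\ a_{21}B & a_{22}B & \cdots & a_{2m}B \\ \vdots & \vdots & & \vdots \\ a_{m1}B & a_{m2}B & \cdots & a_{mm}B \end{pmatrix},$$

where

$$a_{ij}B = \begin{pmatrix} a_{ij}b_{11} & a_{ij}b_{12} & \cdots & a_{ij}b_{1n} \\ a_{ij}b_{21} & a_{ij}b_{22} & \cdots & a_{ij}b_{2n} \\ \vdots & \vdots & & \vdots \\ a_{ij}b_{n1} & a_{ij}b_{n2} & \cdots & a_{ij}b_{nn} \end{pmatrix}, \qquad 1 \le i, j \le m.$$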



Compute $A \otimes B$ if



(a) A = [ I 2 4 ] and fi 




(b) A = 7 3 and 5 = 
(O A ( ' H and fi 



1 1 
1 1 




(d) A = 1 1 1 and B 



1 2 
3 4 



18 Suppose $A_1$ and $A_2$ are a pair of orthogonal Latin squares of order m and $L_1$
and $L_2$ are a pair of order n. Prove that $A_1 \otimes L_1$ and $A_2 \otimes L_2$ are a pair of order
mn. (See Exercise 17.)

19 Suppose $n = p_1^{a_1} p_2^{a_2} \cdots p_r^{a_r}$. Let $k = \min\{p_i^{a_i} : 1 \le i \le r\}$. Prove that there
exists a family of k − 1 mutually orthogonal Latin squares of order n.
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20 Use the following pair of orthogonal Latin squares of order 10 to generate a 
magic square of order 10: 
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(a) Prove that A does not have an orthogonal mate, i.e., show that there is no 
Latin square B of order 5 such that A and B are orthogonal. 



(b) Explain why this does not contradict Theorem 6.3.8. 

(c) Find a Latin square of order 4 that does not have an orthogonal mate. 

22 Prove the existence of the point O used in the proof of Theorem 6.3.12. 



6.4. BALANCED INCOMPLETE BLOCK DESIGNS 

We feel as if we were free; consider Nature as if she were full of special designs;
lay plans as if we were to be immortal; and we find then that these words do make
a genuine difference in our moral life.

-- William James (The Principles of Psychology)

Perhaps the easiest way to describe a finite projective plane of order n is by means
of (0, 1)-matrices.

6.4.1 Definition. Let $P_1, P_2, \ldots, P_m$ and $L_1, L_2, \ldots, L_m$ be the points and lines,
respectively, of a finite projective plane of order n (so that $m = n^2 + n + 1$). Then
the corresponding m × m incidence matrix $A = (a_{ij})$ is defined by

$$a_{ij} = \begin{cases} 1 & \text{if } P_i \text{ is incident with } L_j, \\ 0 & \text{otherwise.} \end{cases}$$
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Figure 6.4.1. Incidence matrix for a projective plane of order 3. 



6.4.2 Example. The incidence matrix for the plane of order 3 constructed in
Example 6.3.18, with points $P_1, P_2, P_3, P_4, Q_1, Q_2, \ldots, Q_9$, and lines

$M_{11} = \{P_1, Q_1, Q_2, Q_3\}$, $M_{12} = \{P_1, Q_4, Q_5, Q_6\}$, $M_{13} = \{P_1, Q_7, Q_8, Q_9\}$,
$M_{21} = \{P_2, Q_1, Q_4, Q_7\}$, $M_{22} = \{P_2, Q_2, Q_5, Q_8\}$, $M_{23} = \{P_2, Q_3, Q_6, Q_9\}$,
$M_{31} = \{P_3, Q_1, Q_6, Q_8\}$, $M_{32} = \{P_3, Q_2, Q_4, Q_9\}$, $M_{33} = \{P_3, Q_3, Q_5, Q_7\}$,
$M_{41} = \{P_4, Q_1, Q_5, Q_9\}$, $M_{42} = \{P_4, Q_2, Q_6, Q_7\}$, $M_{43} = \{P_4, Q_3, Q_4, Q_8\}$,

and $L = \{P_1, P_2, P_3, P_4\}$, is exhibited in Fig. 6.4.1. □



It is hard to look at this matrix and not see binary words! Consider the code
$\mathcal{C} \subseteq F^{13}$, the codewords of which are the rows of this incidence matrix. While $\mathcal{C}$
may not be linear ($0 \notin \mathcal{C}$), it has other interesting properties. For example, because
each point of the plane is incident with four lines, every codeword has weight 4.
Since two points in the projective plane determine a unique line, the ones in two
(different) rows of its incidence matrix overlap in exactly one place. Thus, if
$c_1, c_2 \in \mathcal{C}$, $c_1 \ne c_2$, then the distance

$$d(c_1, c_2) = [\mathrm{wt}(c_1) - 1] + [\mathrm{wt}(c_2) - 1] = 6,$$

i.e., $\mathcal{C}$ is a (13, 13, 6) code. These properties have the following obvious
generalizations.

6.4.3 Theorem. If A is an incidence matrix for a finite projective plane of order
n, then the rows of A comprise an $(n^2 + n + 1, n^2 + n + 1, 2n)$ binary code in which
every codeword has weight n + 1.
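For n = 3 the claims of Theorem 6.4.3 can be confirmed by direct computation. The Python sketch below rebuilds the 13 lines of Example 6.4.2 and checks the weight and distance of the rows of the incidence matrix. The encoding of $P_1, \ldots, P_4$ as 0..3 and $Q_1, \ldots, Q_9$ as 4..12, and all names in the code, are choices made for this illustration only.

```python
from itertools import combinations


def P(i):
    return i - 1                   # P1..P4 are encoded as 0..3


def Q(j):
    return j + 3                   # Q1..Q9 are encoded as 4..12


# Row i lists the Q-indices of the three lines M_i1, M_i2, M_i3 through P_i.
Q_TRIPLES = [
    [(1, 2, 3), (4, 5, 6), (7, 8, 9)],
    [(1, 4, 7), (2, 5, 8), (3, 6, 9)],
    [(1, 6, 8), (2, 4, 9), (3, 5, 7)],
    [(1, 5, 9), (2, 6, 7), (3, 4, 8)],
]

LINES = [{P(1), P(2), P(3), P(4)}]                 # the line L
for i, triples in enumerate(Q_TRIPLES, start=1):
    for triple in triples:
        LINES.append({P(i)} | {Q(j) for j in triple})

# Rows are indexed by points, columns by lines.
ROWS = [[1 if point in line else 0 for line in LINES] for point in range(13)]


def hamming(u, v):
    """Number of positions in which the words u and v differ."""
    return sum(a != b for a, b in zip(u, v))


weights = {sum(row) for row in ROWS}
distances = {hamming(u, v) for u, v in combinations(ROWS, 2)}
print(len(ROWS[0]), len(ROWS), min(distances), weights)   # 13 13 6 {4}
```

Every pair of distinct rows is at distance exactly 6 = 2n, which is slightly more than the minimum-distance statement requires.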

Recall from Section 6.3 that there exists a finite projective plane of order n if and 
only if there exists a family of n — 1 mutually orthogonal Latin squares of order n. 
Because such families are known to exist when n is a power of a prime, Theorem
6.4.3 establishes the existence, e.g., of codes with parameters (73, 73, 16) and
(91, 91, 18), corresponding to n = 8 and n = 9, respectively. What about n = 10?
The first pair of orthogonal Latin squares of order 10 was not discovered until
1959.* How does one go about finding nine of them? Computers?

That finite projective planes have applications to coding theory is already
obvious from Theorem 6.4.3. Less obvious is that this is a two-way street. During
the 1970s and 1980s it was shown that a code exhibiting all of the interesting
properties associated with a finite projective plane of order 10 could not exist! In fact,
the only known proof of the nonexistence of a family of nine mutually orthogonal
Latin squares of order 10 depends on the theory of error-correcting codes!†

The discussion leading up to Theorem 6.4.3 suggests that abstracting certain fea- 
tures of finite projective planes to a more general setting might be an easy way to 
produce binary codes with large error-correcting capabilities. 

6.4.4 Definition. Let V be a set with v elements called points.‡ Suppose
$\{B_1, B_2, \ldots, B_b\}$ is a family of k-element subsets of V called blocks. If each pair
of distinct points of V occurs together in exactly λ blocks, then
$\mathcal{D} = \{B_1, B_2, \ldots, B_b\}$ is a balanced incomplete block design (BIBD) with parameters (v, k, λ).

To avoid trivial cases, we will assume, throughout this section, that all designs
satisfy v > k > 1. By a (v, k, λ)-design, we mean a BIBD with parameters (v, k, λ).

6.4.5 Example. Given a finite projective plane of order n, let V be its set of
points and $\mathcal{D}$ its set of lines interpreted as subsets of V. Then $\mathcal{D}$ is a balanced
incomplete block design with parameters $v = b = n^2 + n + 1$, k = n + 1, and
λ = 1. A less exotic (and less interesting) example is the family of all k-element
subsets of V, a BIBD in which λ = C(v − 2, k − 2). □

In a finite projective plane, not only is each line incident with n + 1 points, but
each point is incident with n + 1 lines. In a balanced incomplete block design, each
block contains k points and, while it may not be the case that each point is
contained in k blocks, each point is contained in the same number q of blocks.§

6.4.6 Theorem. Each point of a (v, k, λ)-design belongs to exactly

$$q = \frac{\lambda(v - 1)}{k - 1} \tag{6.24}$$

blocks.

*E. T. Parker, Orthogonal Latin squares, Proc. Nat. Acad. Sci. (USA) 45 (1959), 859-862.

†It is still an open problem to determine the size of a largest family of mutually orthogonal Latin squares
of order 10.

‡Reflecting the origins of this notion in the design of statistical experiments, the elements of V are also
known as varieties.

§The usual notation for this parameter is not q, but r, a letter made unavailable here by our focus on
r-error-correcting codes.
Proof. Let $V = \{P_1, P_2, \ldots, P_v\}$ be the set of points. Suppose $P \in V$ is fixed but
arbitrary. The theorem is proved by counting, in two different ways, the number of
times P is paired with another point in some block of the design.

By renumbering the points, if necessary, we can assume $P = P_1$. By definition,
$P_1$ and $P_j$ occur together in exactly λ blocks, $2 \le j \le v$. Thus λ(v − 1) is one way
to express the total number of pairings (multiplicities included) that involve $P_1$. On
the other hand, $P_1$ is paired with the remaining k − 1 points in each block to which
it belongs. If $P_1$ is contained in (exactly) q blocks, then the number of pairings that
involve $P_1$ is q(k − 1). Thus, q(k − 1) = λ(v − 1). ■

Consider a BIBD $\mathcal{D} = \{B_1, B_2, \ldots, B_b\}$ with point set $V = \{P_1, P_2, \ldots, P_v\}$ and
parameters (v, k, λ). Let $A = (a_{ij})$ be the v × b incidence matrix for the design, i.e.,

$$a_{ij} = \begin{cases} 1 & \text{if } P_i \in B_j, \\ 0 & \text{otherwise.} \end{cases}$$

Evidently, each row of A contains q ones, and there are k ones in each of its
columns. Counting the total number of ones, first by columns and then by rows, yields
the identity bk = vq, i.e.,

$$b = \frac{vq}{k}. \tag{6.25a}$$

Together, Equations (6.24) and (6.25a) imply that

$$b = \frac{\lambda v(v - 1)}{k(k - 1)}. \tag{6.25b}$$

Because they are functions of v, k, and λ, the numbers q and b will be referred to
as dependent parameters.
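Since q = λ(v − 1)/(k − 1) and b = λv(v − 1)/(k(k − 1)) count blocks, both must be integers, which gives a quick necessary test for a parameter triple. The Python sketch below is an illustration only; the function names are hypothetical, and exact rational arithmetic is used so that no rounding is involved.

```python
from fractions import Fraction


def dependent_parameters(v, k, lam):
    """Return (q, b) for a prospective (v, k, lam)-design as exact fractions."""
    q = Fraction(lam * (v - 1), k - 1)
    b = Fraction(lam * v * (v - 1), k * (k - 1))
    return q, b


def admissible(v, k, lam):
    """Necessary (not sufficient) condition: q and b must both be integers."""
    q, b = dependent_parameters(v, k, lam)
    return q.denominator == 1 and b.denominator == 1


for triple in [(3, 2, 1), (9, 3, 1), (9, 4, 2), (10, 4, 3)]:
    q, b = dependent_parameters(*triple)
    print(triple, "q =", q, "b =", b, "admissible:", admissible(*triple))
```

Passing this test does not guarantee that a design exists; it only removes triples for which the counting argument already fails.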

6.4.7 Corollary. Let A be the v × b incidence matrix of a (v, k, λ)-design. If $\mathcal{C}$ is
the (n, M, d), r-error-correcting code comprised of the rows of A, then n = b,
M = v, d = 2(q − λ), r = q − λ − 1, and wt(c) = q for all $c \in \mathcal{C}$.

Proof. At this point, the only conclusion requiring proof is the value of d. If
$1 \le i < j \le v$, then, by Definition 6.4.4, the q ones in row i of A overlap the
q ones in row j of A in exactly λ places. Therefore, the distance between the
corresponding codewords is (q − λ) + (q − λ). ■

6.4.8 Example. Let $V = \{P_1, P_2, P_3\}$ be a set of points. If $B_1 = \{P_1, P_2\}$,
$B_2 = \{P_2, P_3\}$, and $B_3 = \{P_1, P_3\}$, then $\mathcal{D}_1 = \{B_1, B_2, B_3\}$ is a BIBD with
parameters (v, k, λ) = (3, 2, 1), and dependent parameters b = 3 and
q = λ(v − 1)/(k − 1) = 2. Because d = 2(q − λ) = 2, the rows of the incidence matrix

$$A_1 = \begin{pmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}$$
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Figure 6.4.2. Incidence matrix for a (9, 3, l)-design. 



comprise a (3, 3, 2) binary code of constant weight 2. (Confirm the parameters of
this code directly from the rows of $A_1$.) It is not a very useful code for a variety of
reasons, not the least of which is that r = q − λ − 1 = 0. This code cannot correct
even a single transmission error.

If $B_4 = B_1$, $B_5 = B_2$, and $B_6 = B_3$, then $\mathcal{D}_2 = \{B_1, B_2, \ldots, B_6\}$ is a BIBD with
parameters (v, k, λ) = (3, 2, 2). This time, b = 6, q = 4, d = 4, and r = 1. Thus, the
rows of the incidence matrix

$$A_2 = \begin{pmatrix} 1 & 0 & 1 & 1 & 0 & 1 \\ 1 & 1 & 0 & 1 & 1 & 0 \\ 0 & 1 & 1 & 0 & 1 & 1 \end{pmatrix}$$

comprise a (6, 3, 4), one-error-correcting (repetition) code of constant weight 4.
(Confirm it.) □

6.4.9 Example. Let V = {1, 2, ..., 9}. If $B_1 = \{1,2,3\}$, $B_2 = \{1,4,7\}$,
$B_3 = \{1,5,9\}$, $B_4 = \{1,6,8\}$, $B_5 = \{2,4,9\}$, $B_6 = \{2,5,8\}$, $B_7 = \{2,6,7\}$,
$B_8 = \{3,4,8\}$, $B_9 = \{3,5,7\}$, $B_{10} = \{3,6,9\}$, $B_{11} = \{4,5,6\}$, and $B_{12} = \{7,8,9\}$,
then $\mathcal{D} = \{B_1, B_2, \ldots, B_{12}\}$ is a balanced incomplete block design with parameters
(v, k, λ) = (9, 3, 1). The dependent parameters are b = 12 and
q = λ(v − 1)/(k − 1) = 4. If A is the incidence matrix for this design (exhibited in Fig. 6.4.2),
then the rows of A comprise a (12, 9, 6), two-error-correcting code of constant
weight 4.

Because λ = 1, any given pair of points is contained in exactly one block.
Therefore, any pair of distinct blocks can intersect in at most one point. This implies that
the 1's in two different columns of A can overlap in at most one place, i.e., the
(Hamming) distance between columns is not less than 2(k − 1) = 4. Hence, the
columns of A comprise a (9, 12, 4), one-error-correcting binary code of constant
weight k = 3. □
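The parameters claimed in Example 6.4.9 can be checked by building the incidence matrix directly from the twelve blocks. The following Python sketch does so; the variable and function names are introduced here for illustration and are not part of the text.

```python
from itertools import combinations

POINTS = range(1, 10)
BLOCKS = [
    {1, 2, 3}, {1, 4, 7}, {1, 5, 9}, {1, 6, 8},
    {2, 4, 9}, {2, 5, 8}, {2, 6, 7}, {3, 4, 8},
    {3, 5, 7}, {3, 6, 9}, {4, 5, 6}, {7, 8, 9},
]

# v x b incidence matrix: rows indexed by points, columns by blocks.
A = [[1 if p in block else 0 for block in BLOCKS] for p in POINTS]
COLUMNS = [list(col) for col in zip(*A)]


def hamming(u, v):
    """Number of positions in which the words u and v differ."""
    return sum(a != b for a, b in zip(u, v))


def min_distance(words):
    """Minimum Hamming distance over all pairs of distinct words."""
    return min(hamming(u, v) for u, v in combinations(words, 2))


# Each pair of distinct points should lie in exactly one block (lambda = 1).
pair_counts = {
    pair: sum(1 for block in BLOCKS if set(pair) <= block)
    for pair in combinations(POINTS, 2)
}

print(set(pair_counts.values()))                         # {1}
print(len(A[0]), len(A), min_distance(A))                # 12 9 6
print(len(COLUMNS[0]), len(COLUMNS), min_distance(COLUMNS))  # 9 12 4
```

The row code has constant weight q = 4 and the column code constant weight k = 3, in agreement with the example.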

6.4.10 Definition. A balanced incomplete block design is symmetric if v = b, 
i.e., if the number of points is equal to the number of blocks. 
Note that every finite projective plane affords a symmetric BIBD. The design $\mathcal{D}_1$
from Example 6.4.8 is another. If $A = (a_{ij})$ is the incidence matrix of a
symmetric BIBD, then A must be square, but it need not be symmetric. Despite the
name, there is no requirement that $a_{ij}$ be equal to $a_{ji}$.

If b = v then, from Equation (6.25a), q = k, i.e., if $A = (a_{ij})$ is the v × v incidence
matrix of a symmetric BIBD, then A has exactly k ones in each row and column. In
particular, if $1 \le s, t \le v$, then the scalar (dot) product of rows s and t of A is

$$\sum_{j=1}^{v} a_{sj}a_{tj} = \begin{cases} k & \text{if } s = t, \\ \lambda & \text{otherwise.} \end{cases}$$

Because this is precisely the (s, t)-entry of the product of A and its transpose, the
identity can be expressed more concisely as

$$AA^t = (k - \lambda)I_v + \lambda J_v, \tag{6.26}$$

where $J_v$ is the v × v matrix each of whose entries is 1. The marvelous thing about
this necessary condition for A to be the incidence matrix of a symmetric BIBD is
that it is also sufficient.

6.4.11 Lemma. Let A be a v × v (0, 1)-matrix. Then A satisfies Equation (6.26)
if and only if it is the incidence matrix for a symmetric (v, k, λ)-design.

The proof of sufficiency is left to the exercises.

Among the more surprising consequences of Equation (6.26) is the following:

6.4.12 Bruck-Ryser-Chowla Theorem (Part 1).* Consider a symmetric
balanced incomplete block design with parameters (v, k, λ). If v is even, then
k − λ is a perfect square.

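Proof. Let A be the v × v incidence matrix for the design. By Equation (6.26),

$$AA^t = (k - \lambda)I_v + \lambda J_v = \begin{pmatrix} k & \lambda & \lambda & \cdots & \lambda \\ \lambda & k & \lambda & \cdots & \lambda \\ \lambda & \lambda & k & \cdots & \lambda \\ \vdots & \vdots & \vdots & & \vdots \\ \lambda & \lambda & \lambda & \cdots & k \end{pmatrix}.$$

Subtracting the first row of $AA^t$ from each of its remaining rows gives

$$B = \begin{pmatrix} k & \lambda & \lambda & \cdots & \lambda \\ \lambda - k & k - \lambda & 0 & \cdots & 0 \\ \lambda - k & 0 & k - \lambda & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ \lambda - k & 0 & 0 & \cdots & k - \lambda \end{pmatrix}.$$

*First proved for finite projective planes by R. H. Bruck and H. J. Ryser in 1949, the general theorem was
published by S. Chowla and H. J. Ryser in 1950.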
Adding columns 2 through v of matrix B to column 1 produces

$$C = \begin{pmatrix} x & \lambda & \lambda & \cdots & \lambda \\ 0 & k - \lambda & 0 & \cdots & 0 \\ 0 & 0 & k - \lambda & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & k - \lambda \end{pmatrix},$$

where x = k + λ(v − 1). Therefore,

$$(\det(A))^2 = \det(AA^t) = \det(B) = \det(C) = [k + \lambda(v - 1)](k - \lambda)^{v-1}. \tag{6.27}$$

Because q = k we have, from Equation (6.24), that k(k − 1) = λ(v − 1). Therefore,
$k + \lambda(v - 1) = k^2$. Together with Equation (6.27), this identity implies that one
factor of $(\det(A))^2$ is a perfect square. Hence, the other factor, $(k - \lambda)^{v-1}$, must be a
square as well. Because v − 1 is odd, this is possible only if k − λ is a square. ■

6.4.13 Example. Is there a symmetric BIBD with parameters (46, 10, 2)? When
b = v (so that q = k), Equation (6.24) becomes k(k − 1) = λ(v − 1), a necessary
condition that is satisfied for k = 10, λ = 2, and v = 46. On the other hand, because
v = 46 is even, but k − λ = 8 is not a perfect square, the existence of a symmetric
(46, 10, 2) design is precluded by Theorem 6.4.12. □
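The two checks carried out in Example 6.4.13 are easy to automate. The Python sketch below first confirms the counting condition k(k − 1) = λ(v − 1) for a symmetric design and then applies Part 1 of the Bruck-Ryser-Chowla theorem. The function names are hypothetical, chosen for this illustration.

```python
from math import isqrt


def is_perfect_square(m):
    """True if the integer m is a perfect square."""
    return m >= 0 and isqrt(m) ** 2 == m


def brc_part1_excludes(v, k, lam):
    """True if Theorem 6.4.12 rules out a symmetric (v, k, lam)-design.

    Raises ValueError when the triple already fails the counting
    condition k(k - 1) = lam(v - 1) required of any symmetric design.
    """
    if k * (k - 1) != lam * (v - 1):
        raise ValueError("k(k - 1) != lam(v - 1): not symmetric-design parameters")
    return v % 2 == 0 and not is_perfect_square(k - lam)


print(brc_part1_excludes(46, 10, 2))   # True:  v is even, k - lam = 8
print(brc_part1_excludes(16, 6, 2))    # False: v is even, k - lam = 4
```

A result of False means only that Part 1 of the theorem raises no objection; it says nothing when v is odd.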

Let $\mathcal{C}$ be the r-error-correcting code comprised of the rows of the incidence
matrix of a symmetric balanced incomplete block design. If $\mathcal{C}$ has an even number
of codewords then, from Corollary 6.4.7 and Theorem 6.4.12, r + 1 is a perfect
square. How interesting is that? If, e.g., A is the incidence matrix for a finite
projective plane of order n, then $v = n^2 + n + 1$ is odd. If $A = A_1$ in Example 6.4.8,
then v = 3 is odd. Are there, in fact, any symmetric BIBDs for which v is even?
For that matter, does a nontrivial symmetric BIBD even exist?*

6.4.14 Example. If A is the (0, 1)-matrix exhibited in Fig. 6.4.3, then
computations show (confirm them, at least for a few entries) that $AA^t = 4I_{16} + 2J_{16}$. By
Lemma 6.4.11, this means A is the incidence matrix for a symmetric BIBD with
parameters (16, 6, 2). If $\mathcal{C}$ is the (n, M, d) r-error-correcting code comprised of
the rows of A then, by Corollary 6.4.7, n = b = v = 16, M = v = 16,

*For the purposes of this question, a symmetric BIBD is nontrivial if it has more than three points and does
not correspond to a projective plane.
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A = 
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Figure 6.4.3 



d = 2(q − λ) = 8, and r = 3. In particular, as guaranteed by Corollary 6.4.7 and
Theorem 6.4.12, r + 1 = 4 is a perfect square. □

Okay. Example 6.4.14 establishes the existence of a nontrivial symmetric BIBD. 
Are there more? Yes. In fact, we can systematically produce as many as we like. 
Here's how. 

6.4.15 Definition. Let H be an n × n matrix, each of whose entries is either +1
or −1. If

$$HH^t = nI_n,$$

then H is a Hadamard matrix of order n.

Note that $HH^t = nI_n$ if and only if $H^{-1} = (1/n)H^t$, if and only if $H^tH = nI_n$.

If all the entries in some row or column of a Hadamard matrix are multiplied by
−1, the result is another Hadamard matrix. Thus, any Hadamard matrix can be
transformed into a normalized Hadamard matrix, one whose first row and column
consist entirely of +1's.

6.4.16 Example. The unique normalized Hadamard matrices of orders 1 and 2
are

$$(1) \quad \text{and} \quad \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix},$$

respectively. Before reading on, take a moment to convince yourself that there is no
Hadamard matrix of order 3.
When n = 4, there are (at least) two normalized Hadamard matrices, namely,

$$H_1 = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{pmatrix} \quad \text{and} \quad H_2 = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & 1 & -1 \\ 1 & -1 & -1 & 1 \end{pmatrix}.$$

Observe that $H_2$ can be obtained from $H_1$ by interchanging its second and third
columns, an elementary column operation. In other words, $H_2 = H_1P$, where the
permutation matrix

$$P = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$

More generally, if P is a fixed but arbitrary permutation matrix of size n then,
because $P^{-1} = P^t$, the n × n (+1, −1)-matrix H is a Hadamard matrix if and only if
K = HP is a Hadamard matrix. □

The equation $HH^t = nI_n$ implies that any two different rows of H are orthogonal,
not in the sense of orthogonal Latin squares, but in the sense that their scalar
product (over ℝ) is zero. In particular, if H is normalized, then every row but the first
must contain the same number of +1's and −1's. If H is a Hadamard matrix of
order n > 1 then, evidently, n must be even. There is more.

If $H_1$ is a normalized Hadamard matrix of order n > 2 then, by permuting the
columns of $H_1$, if necessary, we can obtain a normalized Hadamard matrix $H_2$ such
that the first n/2 entries in the second row of $H_2$ all equal +1. (See, e.g., $H_1$ and $H_2$
in Example 6.4.16.) For a fixed but arbitrary row index i > 2, let t be the number of
+1's in the first n/2 columns of row i of $H_2$. If s = (n/2) − t, then, among the first
n/2 columns of the ith row of $H_2$, there must be a total of s −1's. Moreover, by
orthogonality with row 1, there must be s occurrences of +1 and t occurrences
of −1 among the last n/2 entries of row i. In particular,

2s + 2t = n.

Finally, the orthogonality of rows 2 and i yields

2t − 2s = 0.

Therefore, s = t and 4t = n. Let's formalize this last observation.

6.4.17 Theorem. If H is a Hadamard matrix of order n > 2, then n is an integer
multiple of 4.
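For very small orders, the defining condition $HH^t = nI_n$ can be tested by brute force. The Python sketch below checks a given matrix and, by exhaustive search, confirms that Hadamard matrices exist for n = 1 and n = 2 but not for n = 3. The function names are invented for this illustration, and the search is feasible only for tiny n because it examines up to $2^{n^2}$ matrices.

```python
from itertools import product


def is_hadamard(H):
    """True if H is a square (+1, -1)-matrix whose rows are pairwise orthogonal."""
    n = len(H)
    if any(len(row) != n for row in H):
        return False
    if any(x not in (1, -1) for row in H for x in row):
        return False
    # Entry (i, j) of H H^t must be n on the diagonal and 0 elsewhere.
    return all(
        sum(a * b for a, b in zip(H[i], H[j])) == (n if i == j else 0)
        for i in range(n) for j in range(n)
    )


def hadamard_exists(n):
    """Exhaustive search over all n x n (+1, -1)-matrices (tiny n only)."""
    for entries in product((1, -1), repeat=n * n):
        H = [list(entries[r * n:(r + 1) * n]) for r in range(n)]
        if is_hadamard(H):
            return True
    return False


H1 = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]
H2 = [[1, 1, 1, 1], [1, 1, -1, -1], [1, -1, 1, -1], [1, -1, -1, 1]]

print(is_hadamard(H1), is_hadamard(H2))
print([hadamard_exists(n) for n in (1, 2, 3)])
```

The matrices H1 and H2 are the two order-4 examples given in Example 6.4.16.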
What does any of this have to do with symmetric designs? Suppose H is a
normalized Hadamard matrix of order n = 4t ≥ 8. Delete its first row and column and
replace the −1's in the resulting matrix with 0's. This produces a square (0, 1)-matrix
A, of order v = 4t − 1, with exactly 2t zeros in each row and column. Moreover,
by the orthogonality of the rows of H,

$$AA^t = tI_v + (t - 1)J_v. \tag{6.28}$$

Thus (by Lemma 6.4.11), A is the incidence matrix of a symmetric balanced
incomplete block design, the parameters of which are v = 4t − 1, λ = t − 1, and
k = t + λ = 2t − 1.

Conversely, suppose A is a v × b incidence matrix for some BIBD $\mathcal{D}$ having
parameters (4t − 1, 2t − 1, t − 1), where t ≥ 2. Then, from Equation (6.25b),

$$b = \frac{(4t - 1)(4t - 2)(t - 1)}{(2t - 1)(2t - 2)} = 4t - 1,$$

so $\mathcal{D}$ is symmetric.

Let H be the matrix obtained from A by changing all of its zeros to −1's and
adding a new first row and column consisting entirely of +1's. Then H is a
4t × 4t (+1, −1)-matrix, with exactly 2t ones in each row but the first. In particular,
row 1 of H is orthogonal to each of rows 2 through 4t.

Suppose i and m are fixed but arbitrary integers satisfying 1 < i < m ≤ 4t.
Because $\mathcal{D}$ is symmetric, q = k = 2t − 1. Because $\mathcal{D}$ is a design, the 2t − 1 ones
in the (i − 1)st row of A overlap the 2t − 1 ones in its (m − 1)st row in exactly
λ = t − 1 places. Therefore, the 2t ones in row i of H overlap the 2t ones in
row m of H in exactly t places. Because the remaining 2t entries in each of these
rows of H all equal −1, it follows that the scalar product of rows i and m of H is 0.
Because i and m were arbitrary, $HH^t = 4tI_{4t}$, i.e., H is a Hadamard matrix of order
n = 4t. Let's summarize.

6.4.18 Definition. Suppose $\mathcal{D}$ is a balanced incomplete block design with an
incidence matrix that can be obtained from a normalized Hadamard matrix of order
n = 4t ≥ 8 by changing its −1's to 0's, and deleting its first row and column. Then
$\mathcal{D}$ is a Hadamard design of order 4t − 1.

6.4.19 Theorem. A balanced incomplete block design is a Hadamard design if
and only if its parameters are (4t − 1, 2t − 1, t − 1) for some t ≥ 2.
6.4.20 Example. When t = 2, the parameters from Theorem 6.4.19 are (7,3,1). 
Evidently, the projective plane of order 2 affords a Hadamard design! Let's find the 
corresponding Hadamard matrix. 
With respect to the model described in Example 6.3.15, with point set
V = {1, 2, ..., 7}, the blocks are $B_1 = \{1,2,3\}$, $B_2 = \{1,4,7\}$, $B_3 = \{1,5,6\}$,
$B_4 = \{2,4,6\}$, $B_5 = \{2,5,7\}$, $B_6 = \{3,4,5\}$, and $B_7 = \{3,6,7\}$. Therefore (check
it), the incidence matrix is

$$A = \begin{pmatrix} 1 & 1 & 1 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 1 & 0 & 1 \end{pmatrix}.$$

Thus (from the discussion leading up to Definition 6.4.18 and Theorem 6.4.19), the
normalized Hadamard matrix yielding this design is (check it)

$$H = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & -1 & -1 & -1 & -1 \\ 1 & 1 & -1 & -1 & 1 & 1 & -1 & -1 \\ 1 & 1 & -1 & -1 & -1 & -1 & 1 & 1 \\ 1 & -1 & 1 & -1 & 1 & -1 & 1 & -1 \\ 1 & -1 & -1 & 1 & -1 & 1 & 1 & -1 \\ 1 & -1 & -1 & 1 & 1 & -1 & -1 & 1 \\ 1 & -1 & 1 & -1 & -1 & 1 & -1 & 1 \end{pmatrix}.$$

□



It has been conjectured that Hadamard matrices of order 4t exist for every
integer t ≥ 1. However, the fact that there are infinitely many Hadamard designs does
not depend on the validity of this conjecture.

6.4.21 Theorem. For any nonnegative integer k, there exists a Hadamard
matrix of order $2^k$.



The proof is left to the exercises. 



6.4. EXERCISES 

1 Let $F_1, F_2, \ldots, F_6$ be the faces of a cube, and $B_1, B_2, \ldots, B_8$ its vertices,
interpreted as three-element subsets of $V = \{F_1, F_2, \ldots, F_6\}$. Prove or disprove
that $\mathcal{D} = \{B_1, B_2, \ldots, B_8\}$ is a BIBD.

2 Suppose P and Q are two (different) points of a (v, k, λ)-design.

(a) How many blocks of the design contain P or Q?

(b) How many blocks of the design contain P or Q but not both?
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3 Prove that there is no BIBD with parameters 
(a) (9,4,2). (b) (10,4,3). (c) (22,7,2). 

4 Prove that a BIBD whose parameters satisfy λ = k(k − 1)/(v − 1) is
necessarily symmetric.

5 If $\mathcal{D} = \{B_1, B_2, \ldots, B_b\}$ is a (v, k, λ)-design with point set V, its complement
is $\mathcal{D}^c = \{V \setminus B_1, V \setminus B_2, \ldots, V \setminus B_b\}$.

(a) If k < v − 1, prove that $\mathcal{D}^c$ is a BIBD.

(b) What are the parameters of $\mathcal{D}^c$?

(c) Describe how to obtain the incidence matrix for $\mathcal{D}^c$ from the incidence
matrix for $\mathcal{D}$.

(d) Describe a (7, 4, 2)-design.

(e) Describe a (9, 6, 5)-design.

6 Prove that the dependent parameter q > λ for any BIBD $\mathcal{D}$.

7 Let A be the incidence matrix of a (v, k, λ)-design $\mathcal{D}$.

(a) Show that $AA^t = (q - \lambda)I_v + \lambda J_v$.

(b) Show that $\det(AA^t) = qk(q - \lambda)^{v-1}$.

(c) Show that $\det(AA^t) > 0$.

(d) Prove that b ≥ v.

(e) Prove that k ≤ q.

8 Let A be the incidence matrix of a (v, k, λ)-design $\mathcal{D} = \{B_1, B_2, \ldots, B_b\}$.

(a) Show that $[A^tA]_{ij} = o(B_i \cap B_j)$.

(b) Show that $A^tA$ need not equal $(k - \lambda)I_b + \lambda J_b$.

(c) Show that $A^tA = AA^t$ if and only if $\mathcal{D}$ is symmetric.

(d) Show that $\det(A^tA) \ne 0$ if and only if $\mathcal{D}$ is symmetric. (Hint:
Exercise 7(d).)

(e) If $\mathcal{D}$ is symmetric, prove that $A^t$ is the incidence matrix of a symmetric
BIBD.*

(f) If $\mathcal{D}$ is symmetric, prove that any two blocks of $\mathcal{D}$ have exactly λ points in
common.

9 Describe how you might construct a BIBD with parameters

(a) (7,3,2). (b) (9,3,2). (c) (9,3,50).

10 A Steiner system† with parameters (t, k, v) is a set V with v elements called
points and a family of distinct k-element subsets of V called blocks, with the

*The design $\mathcal{D}^d$, with incidence matrix $A^t$, is the dual of $\mathcal{D}$ in the sense that the points (blocks) of $\mathcal{D}^d$
correspond to the blocks (points) of $\mathcal{D}$.

†Named for Jakob Steiner, but previously studied by Thomas Kirkman [On a problem in combinations,
Cambridge & Dublin Math. J. 2 (1847), 191-204], these objects are sometimes called Steiner triple systems.
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X = 
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Figure 6.4.4 



property that each t-element subset of V is contained in exactly one of the
blocks.

(a) Exhibit a Steiner system with parameters (2,3,7).

(b) Show that every finite projective plane is a Steiner system.

(c) Show that a Steiner system with parameters (2, k, v) is a balanced
incomplete block design.

(d) Let A be the incidence matrix for a Steiner system with parameters
(t, k, v). Show that the columns of A comprise an (n, M, d) binary code
where n = v, M = C(v, t)/C(k, t), and d ≥ 2(k − t) + 1.

11 From Exercise 28, Section 6.2, the extended Golay code $\mathcal{C}_{24}$ is a (24, 4096, 8)
linear code generated by the matrix $G = (I_{12} \mid X)$, where X is the symmetric
12 × 12 matrix in Fig. 6.4.4. Let A be the 11 × 11 matrix obtained from X by
deleting its last row and column.

(a) Show that A is the incidence matrix for a symmetric (11, 6, 3)-design.

(b) Is the design in part (a) a Hadamard design? (Justify your answer.)

12 Prove the sufficiency part of Lemma 6.4.11.

13 Let $J_n$ be the n × n matrix each of whose entries is 1. Then $J_n$ is a rank 1
matrix whose only nonzero eigenvalue is equal to n.

(a) Use this observation to prove that the eigenvalues of $(k - \lambda)I_v + \lambda J_v$ are
k − λ with multiplicity v − 1, and k + λ(v − 1) with multiplicity 1.

(b) Use part (a) to give an eigenvalue proof of Equation (6.27).

14 Let H be a Hadamard matrix of order n. Prove that
(a) −H is a Hadamard matrix of order n.

(b) $\begin{pmatrix} H & H \\ H & -H \end{pmatrix}$ is a Hadamard matrix of order 2n.
15 Suppose H is a Hadamard matrix of order n. Prove that $|\det(H)| = n^{n/2}$.

16 Prove that the projective plane of order 3 does not afford a Hadamard design.

17 Explain why the symmetric design afforded by a projective plane of order n
cannot be a Hadamard design for any n > 2.

18 If $A_1$ and $A_2$ are m × m matrices and $B_1$ and $B_2$ are n × n matrices, then the
Kronecker product (described in Exercise 17, Section 6.3) satisfies
$(A_1 \otimes B_1)(A_2 \otimes B_2) = (A_1A_2) \otimes (B_1B_2)$.

(a) Use this property to prove that the Kronecker product of two Hadamard 
matrices is a Hadamard matrix. 

(b) Prove Theorem 6.4.21. 

19 Describe how to construct a symmetric BIBD with parameters 
(a) (15,7,3). (b) (31,15,7). (c) (31,16,8). 

20 Expanding on the Kronecker product technique of Exercise 18, J. Williamson
proved the following: Let p be an odd prime. Suppose s is a positive integer
such that $p^s - 1$ is a multiple of 4. If there is a Hadamard matrix of order
m > 1, then there is a Hadamard matrix of order $m(p^s + 1)$. Use Williamson's
theorem to prove the existence of a Hadamard matrix of order

(a) 12. (b) 24. (c) 28. (d) 52.

21 The normalized Hadamard matrices in Example 6.4.16 are all symmetric (i.e.,
$H^t = H$).

(a) Find a nonsymmetric normalized Hadamard matrix of order 4.

(b) Prove that there exists a symmetric Hadamard matrix of order $2^k$ for every
k ≥ 0.

(c) Matrix A is said to be skew symmetric if $A^t = -A$. Prove that there are no
skew-symmetric Hadamard matrices.

22 A Hadamard matrix H is said to be of skew type* if H = I + S, where S is skew
symmetric, i.e., $S^t = -S$. Exhibit a skew-type Hadamard matrix

(a) of order 2. (b) of order 4.

23 Let $\mathcal{D}$ be a Hadamard design of order 4t − 1. Let $\mathcal{C}$ be the binary code
comprised of the rows of an incidence matrix for $\mathcal{D}$. Prove that $\mathcal{C}$ is a
(4t − 1, 4t − 1, 2t) code.

24 Confirm that $HH^t = 8I_8$ for the matrix H in Example 6.4.20.

*It is known that if there is a skew type Hadamard matrix of order n, then there exists a Hadamard matrix
of order n(n − 1); and if there is a skew type Hadamard matrix of order n and a symmetric Hadamard
matrix of order n + 4, then there exists a Hadamard matrix of order n(n + 3).
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25 The complement of a binary word w is the word w* obtained from w by 
changing all of its O's to l's and all of its l's to O's. For any binary code 
define f^jc*:^ <€}. 

Suppose H is the Hadamard matrix constructed in Example 6.4.20. Let ^ 
be the (8,8,4) binary code obtained from the rows of H by replacing the — l's 
with O's. 

(a) Show that T is an (8,8,4) binary code. 

(b) Show that <<? U T is an (8,16,4) binary code. 

26 Part 1 of the Bruck-Ryser-Chowla theorem (Theorem 6.4.12) gives a 
necessary condition for $(v, k, \lambda)$ to be the triple of parameters for a symmetric 
BIBD when $v$ is even. Part 2 gives a necessary condition when $v$ is odd, 
namely, that there exist integers $x, y, z$, not all zero, such that 

$$z^2 = (k - \lambda)x^2 + (-1)^{(v-1)/2}\lambda y^2.$$

(a) Show that the Bruck-Ryser-Chowla equation for a projective plane of 
order 10 is $y^2 + z^2 = 10x^2$. 

(b) Find positive integers $x, y$, and $z$ that solve the equation $y^2 + z^2 = 10x^2$. 

(c) Show that the Bruck-Ryser-Chowla condition for the existence of a 
Hadamard matrix of order $4t \ge 8$ is that there is a solution in integers 
$x, y, z$, not all zero, of the equation $(t - 1)y^2 + z^2 = tx^2$. 

(d) Show that there exist positive integer solutions $x, y, z$ of the equation 

$$(t - 1)y^2 + z^2 = tx^2, \quad t \ge 2.$$

27 Let $H$ be a normalized Hadamard matrix of order $n$. Let $K$ be the $(n-1)$- 
square submatrix of $H$ obtained by deleting its first row and column, i.e., 

$$H = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ 1 & & & \\ \vdots & & K & \\ 1 & & & \end{pmatrix}.$$

Let $J_{n-1}$ be the $(n-1)$-square matrix each of whose entries is $+1$. 

(a) Show that $K^tK = KK^t = nI_{n-1} - J_{n-1}$. 

(b) Show that $K^{-1} = (1/n)(K^t - J_{n-1})$. 

(c) Prove Equation (6.28). 

28 Prove or disprove that in every Hadamard matrix of order $n > 1$, half the 
entries are $+1$'s and half are $-1$'s. 
The purpose of this appendix is to prove two results from Section 1.9, beginning 
with the following. 

1.9.14 Newton's Identities. For a fixed but arbitrary positive integer $n$, let 
$M_r = M_r(x_1, x_2, \ldots, x_n)$ and $E_r = E_r(x_1, x_2, \ldots, x_n)$. Then, for all $t \ge 1$, 

$$M_t - M_{t-1}E_1 + M_{t-2}E_2 - \cdots + (-1)^t tE_t = 0. \tag{A1}$$

If $t > n$, Equation (A1) has the simpler form 

$$M_t - M_{t-1}E_1 + M_{t-2}E_2 - \cdots + (-1)^n M_{t-n}E_n = 0 \tag{A2}$$

(because, e.g., $E_t(x_1, x_2, \ldots, x_n) = 0$). 
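Before working through the proof, it can be reassuring to check the identities numerically. The following is a minimal sketch, not part of the original text; the function names `power_sum`, `elementary`, and `newton_lhs` are illustrative choices, and the sample values are arbitrary. It evaluates the left-hand side of Equation (A1) for several values of t, including t > n, where the terms involving E_j with j > n vanish automatically.

```python
from itertools import combinations
from math import prod

def power_sum(xs, r):
    """M_r: the sum of the r-th powers of the numbers in xs."""
    return sum(x ** r for x in xs)

def elementary(xs, r):
    """E_r: the sum of the products over all r-element subsets (0 when r > n)."""
    return sum(prod(c) for c in combinations(xs, r))

def newton_lhs(xs, t):
    """Left-hand side of Equation (A1) evaluated at xs."""
    total = power_sum(xs, t)
    for j in range(1, t):
        total += (-1) ** j * power_sum(xs, t - j) * elementary(xs, j)
    return total + (-1) ** t * t * elementary(xs, t)

xs = [2, -1, 3, 5]          # n = 4 arbitrary integers
print([newton_lhs(xs, t) for t in range(1, 8)])   # all entries are 0
```

Integer inputs keep the arithmetic exact, so every entry of the printed list is exactly zero rather than merely small.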

The mathematics behind the proof is relatively simple, involving the product rule 
for differentiation and the fact that if $p(x)$ is a polynomial of degree $n \ge 1$, and $c$ is 
a constant, then there exists a unique polynomial $q(x)$ such that 

$$p(x) = (x - c)q(x) + p(c). \tag{A3}$$

A1.1 Example. Suppose $p(x) = x^4 + x^3 - 6x^2 - 2x + 9$ and $c = -3$. Dividing 
$p(x)$ by $(x + 3)$ yields the quotient $q(x) = x^3 - 2x^2 - 2$, and the remainder 
$p(-3) = 15$. (Confirm that 

$$x^4 + x^3 - 6x^2 - 2x + 9 = (x + 3)(x^3 - 2x^2 - 2) + 15.)$$

If $p(x) = x^4 + 3x^3 - 2x^2 - 4x + 6$ and $c = -3$, then $p(-3) = 0$. In this case, 
$(x + 3)$ is a factor of $p(x)$. The other factor is the quotient $q(x) = p(x)/(x + 3) = x^3 - 2x + 2$. 
(Confirm that $p(x) = (x + 3)(x^3 - 2x + 2)$.) □ 
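The two divisions in Example A1.1 can be confirmed mechanically. The sketch below uses synthetic division (Horner's scheme), which is one standard way to divide by a linear factor; it is offered as an illustration, and the helper name `divide_by_linear` is hypothetical.

```python
def divide_by_linear(coeffs, c):
    """Divide p(x) by (x - c).

    coeffs lists the coefficients of p from the highest power down.
    Returns (quotient coefficients, remainder), where the remainder is p(c).
    """
    running = []
    carry = 0
    for a in coeffs:
        carry = carry * c + a      # Horner step
        running.append(carry)
    return running[:-1], running[-1]

# p(x) = x^4 + x^3 - 6x^2 - 2x + 9, c = -3
print(divide_by_linear([1, 1, -6, -2, 9], -3))   # ([1, -2, 0, -2], 15)
# p(x) = x^4 + 3x^3 - 2x^2 - 4x + 6, c = -3
print(divide_by_linear([1, 3, -2, -4, 6], -3))   # ([1, 0, -2, 2], 0)
```

The quotient lists read as x^3 - 2x^2 - 2 and x^3 - 2x + 2, matching the example.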



Combinatorics, Second Edition, by Russell Merris. 
ISBN 0-471-26296-X © 2003 John Wiley & Sons, Inc. 
Proof of Newton's Identities. As a polynomial identity, Equation (A1) can be 
proved by showing it to be valid for all possible substitutions for the variables. 
For fixed but arbitrary numbers $a_1, a_2, \ldots, a_n$, define $M_r = M_r(a_1, a_2, \ldots, a_n)$, 
$E_r = E_r(a_1, a_2, \ldots, a_n)$, and 

$$p(x) = (x - a_1)(x - a_2) \cdots (x - a_n) = x^n - E_1x^{n-1} + E_2x^{n-2} - \cdots + (-1)^nE_n.$$

If $c$ is a constant then, as in Equation (A3), 

$$p(x) = (x - c)q(x) + p(c), \tag{A4}$$

where 

$$p(c) = c^n - E_1c^{n-1} + E_2c^{n-2} - \cdots + (-1)^nE_n. \tag{A5}$$

Because $p(a_i) = 0$, substituting $c = a_i$ in Equation (A5) yields 

$$0 = a_i^n - E_1a_i^{n-1} + E_2a_i^{n-2} - \cdots + (-1)^nE_n. \tag{A6}$$

Multiplying both sides of Equation (A6) by $a_i^{t-n}$ and summing on $i$ yields 

$$0 = M_n - E_1M_{n-1} + E_2M_{n-2} - \cdots + (-1)^n nE_n$$

when $t = n$, and 

$$0 = M_t - M_{t-1}E_1 + M_{t-2}E_2 - \cdots + (-1)^nM_{t-n}E_n$$

when $t > n$. 

When $t < n$, things are a bit more complicated. Here, we need an explicit 
formula, not for $p(c)$, but for 

$$q(x) = x^{n-1} + (c - E_1)x^{n-2} + (c^2 - E_1c + E_2)x^{n-3} + (c^3 - E_1c^2 + E_2c - E_3)x^{n-4} + \cdots + (c^{n-1} - E_1c^{n-2} + \cdots + (-1)^{n-1}E_{n-1}).$$

(Confirm that this is $q(x)$ in Equation (A4).) 

Substituting $c = a_i$, not in Equation (A5), but in Equation (A4), we obtain, after 
canceling $(x - a_i)$ from both sides, that 

$$(x - a_1) \cdots (x - a_{i-1})(x - a_{i+1}) \cdots (x - a_n) = x^{n-1} + (a_i - E_1)x^{n-2} + (a_i^2 - E_1a_i + E_2)x^{n-3} + (a_i^3 - E_1a_i^2 + E_2a_i - E_3)x^{n-4} + \cdots + (a_i^{n-1} - E_1a_i^{n-2} + \cdots + (-1)^{n-1}E_{n-1}). \tag{A7}$$
Because $\sum_{i=1}^{n} a_i^r = M_r$, summing the right-hand side of Equation (A7) on $i$ yields 

$$nx^{n-1} + (M_1 - nE_1)x^{n-2} + (M_2 - E_1M_1 + nE_2)x^{n-3} + (M_3 - E_1M_2 + E_2M_1 - nE_3)x^{n-4} + \cdots + (M_{n-1} - E_1M_{n-2} + \cdots + (-1)^{n-1}nE_{n-1}). \tag{A8}$$

By the product rule from calculus, the sum on the left-hand side of Equation (A7) is 
the derivative 

$$p'(x) = \sum_{i=1}^{n} (x - a_1) \cdots (x - a_{i-1})(x - a_{i+1}) \cdots (x - a_n).$$

Another way to express the derivative of $p(x)$ is 

$$nx^{n-1} - (n-1)E_1x^{n-2} + (n-2)E_2x^{n-3} - (n-3)E_3x^{n-4} + \cdots + (-1)^{n-1}E_{n-1}. \tag{A9}$$

Comparing the coefficient of $x^k$ in Expressions (A8) and (A9), $0 \le k \le n - 1$, 
yields 

$$-(n-1)E_1 = M_1 - nE_1, \quad\text{or}\quad 0 = M_1 - E_1,$$

$$(n-2)E_2 = M_2 - E_1M_1 + nE_2, \quad\text{or}\quad 0 = M_2 - E_1M_1 + 2E_2,$$

$$-(n-3)E_3 = M_3 - E_1M_2 + E_2M_1 - nE_3, \quad\text{or}\quad 0 = M_3 - E_1M_2 + E_2M_1 - 3E_3,$$

and so on until, finally, 

$$0 = M_{n-1} - E_1M_{n-2} + \cdots + (-1)^{n-2}E_{n-2}M_1 + (-1)^{n-1}(n-1)E_{n-1},$$

precisely Newton's identities when $t < n$. ■ 

We now come to the second objective of this appendix, a proof of the following 
result from Section 1.9. 

1.9.11 Theorem. Any polynomial, symmetric in the variables $x_1, x_2, \ldots, x_n$, is a 
polynomial in the power sums 

$$M_t = M_t(x_1, x_2, \ldots, x_n), \quad 1 \le t \le n.$$

Two proofs will be given. The first is a brute-force inductive proof. The second, 
while a little longer and subtler, is also more illuminating. Before giving either, 
we observe that Theorem 1.9.11 is equivalent to a classical result from 
nineteenth-century invariant theory. 
A1.2 Fundamental Theorem of Symmetric Polynomials. Any polynomial, 
symmetric in the variables $x_1, x_2, \ldots, x_n$, is a polynomial in the elementary 
symmetric functions $E_t = E_t(x_1, x_2, \ldots, x_n)$, $1 \le t \le n$. 

From Newton's identities, 

$$M_t - M_{t-1}E_1 + M_{t-2}E_2 - \cdots + (-1)^{t-1}M_1E_{t-1} + (-1)^t tE_t = 0,$$

where $M_0 = E_0 = 1$. Solving recursively for the power sums, we obtain 

$$M_1 = E_1,$$

$$M_2 = E_1^2 - 2E_2,$$

$$M_3 = E_1^3 - 3E_1E_2 + 3E_3,$$

$$M_4 = E_1^4 - 4E_1^2E_2 + 4E_1E_3 + 2E_2^2 - 4E_4,$$

and so on. For each $t \ge 1$, $M_t$ is a polynomial in $E_s$, $1 \le s \le t$. Therefore, the 
fundamental theorem is a consequence of Theorem 1.9.11. To prove the converse, 
Newton's identities are solved recursively for the elementary symmetric functions: 

$$E_1 = M_1,$$

$$E_2 = \tfrac{1}{2}[M_1^2 - M_2],$$

$$E_3 = \tfrac{1}{6}[M_1^3 - 3M_1M_2 + 2M_3],$$

$$E_4 = \tfrac{1}{24}[M_1^4 - 6M_1^2M_2 + 8M_1M_3 + 3M_2^2 - 6M_4],$$

and so on. For each $t \ge 1$, $E_t$ is a polynomial in $M_s$, $1 \le s \le t$. Therefore, Theorem 
1.9.11 is a consequence of the fundamental theorem. 
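The recursive solution for the elementary symmetric functions can be carried out numerically. The sketch below is illustrative only; it rearranges Newton's identities as t·E_t = Σ_{i=1}^{t} (-1)^(i-1) E_{t-i} M_i (with E_0 = 1), and the function name `elementary_from_power_sums` is an assumption, not notation from the text. Exact rational arithmetic is used because of the division by t.

```python
from fractions import Fraction

def elementary_from_power_sums(M):
    """Given M[1..k] (M[0] is ignored), return [E_0, E_1, ..., E_k].

    Uses t * E_t = sum_{i=1}^{t} (-1)^(i-1) * E_{t-i} * M_i, a rearrangement
    of Newton's identities with E_0 = 1.
    """
    k = len(M) - 1
    E = [Fraction(1)]
    for t in range(1, k + 1):
        s = sum((-1) ** (i - 1) * E[t - i] * M[i] for i in range(1, t + 1))
        E.append(s / t)
    return E

xs = [1, 2, 3]
M = [None] + [sum(x ** t for x in xs) for t in range(1, 4)]   # M_1=6, M_2=14, M_3=36
print([int(e) for e in elementary_from_power_sums(M)])         # [1, 6, 11, 6]
```

For the numbers 1, 2, 3 this recovers E_1 = 6, E_2 = 11, E_3 = 6, the coefficients (up to sign) of (x - 1)(x - 2)(x - 3).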

Our first proof of Theorem 1.9.11 is achieved by proving the fundamental 
theorem. In order to do that, we need to introduce a natural ordering on the set 
of partitions of m. 

A1.3 Definition. Suppose $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_\ell]$ and $\beta = [\beta_1, \beta_2, \ldots, \beta_s]$ are two 
partitions of $m$. Then $\alpha$ comes after $\beta$ in dictionary order, written $\alpha > \beta$, if 
$\alpha_1 > \beta_1$ or if $\alpha_i = \beta_i$, $1 \le i < j$, and $\alpha_j > \beta_j$, for some positive integer $j \le \ell$. 

For example, $[6, 1^2] > [5, 3] > [5, 2, 1] > [4^2]$. A little less formally, $\alpha > \beta$ if $\alpha$ 
has the larger part in the first place where the partitions differ. If $\alpha, \beta \vdash m$, then 
$\alpha \ge \beta$ means $\alpha = \beta$ or $\alpha > \beta$. 
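Dictionary order on partitions coincides with the built-in lexicographic comparison of sequences in many programming languages. As a small illustration (a sketch, with partitions represented as Python lists of parts in nonincreasing order, which is an assumption about representation rather than anything in the text):

```python
def comes_after(alpha, beta):
    """True if partition alpha comes strictly after beta in dictionary order.

    Both arguments are lists of parts in nonincreasing order and are assumed
    to be partitions of the same integer m.
    """
    return tuple(alpha) > tuple(beta)

partitions_of_8 = [[4, 4], [5, 3], [6, 1, 1], [5, 2, 1]]
print(sorted(partitions_of_8, reverse=True))
# [[6, 1, 1], [5, 3], [5, 2, 1], [4, 4]]
```

The sorted output reproduces the chain [6, 1^2] > [5, 3] > [5, 2, 1] > [4^2].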

Proof of the Fundamental Theorem of Symmetric Polynomials. Let $f = 
f(x_1, x_2, \ldots, x_n)$ be a symmetric polynomial. Write $f = f_0 + f_1 + \cdots + f_k$, where 
$f_i = f_i(x_1, x_2, \ldots, x_n)$ is the homogeneous part of $f$ consisting of all terms of (total) 
degree $i$. In particular, we are assuming that $f$, itself, is of degree $k$. Consider one of 
the monomial terms of $f_k$, say 

$$cx_1^{r_1}x_2^{r_2} \cdots x_n^{r_n}, \tag{A10}$$

where $r_1 + r_2 + \cdots + r_n = k$. Because $f_k$ is symmetric, we may assume that 
$r_1 \ge r_2 \ge \cdots \ge r_\ell \ge 1 > r_{\ell+1} = \cdots = r_n = 0$. Let $\alpha = [r_1, r_2, \ldots, r_\ell] \vdash k$. 

Among all partitions of $k$ that occur as the sequence of exponents of some 
monomial term of $f_k$, suppose $\alpha$ is the largest (coming last) in dictionary order, meaning 
that $r_1$ is the largest exponent to occur in any monomial term of $f_k$; among all 
monomial terms of $f_k$ that have $r_1$ as an exponent, $r_2$ is the maximum second largest 
exponent, and so on. 

Consider the product 

$$E_1^{s_1}E_2^{s_2} \cdots E_n^{s_n}, \tag{A11}$$

where $s_1, s_2, \ldots, s_n$ are nonnegative integers. In dictionary order of the exponents, the last monomial 
term in Expression (A11) is 

$$x_1^{s_1}(x_1x_2)^{s_2} \cdots (x_1x_2 \cdots x_n)^{s_n}.$$

In order for this last term to equal $x_1^{r_1}x_2^{r_2} \cdots x_n^{r_n}$, we need 

$$r_1 = s_1 + s_2 + s_3 + \cdots + s_n,$$

$$r_2 = s_2 + s_3 + \cdots + s_n,$$

$$r_3 = s_3 + \cdots + s_n,$$

and so on, with $r_\ell = s_\ell + \cdots + s_n$. These equations are satisfied when 

$$s_{\ell+1} = \cdots = s_n = 0,$$

$$s_\ell = r_\ell,$$

$$s_{\ell-1} = r_{\ell-1} - r_\ell,$$

$$s_{\ell-2} = r_{\ell-2} - r_{\ell-1},$$

and so on, finally setting $s_1 = r_1 - r_2$. 

With these choices for $s_1, s_2, \ldots, s_n$, $f_k - cE_1^{s_1}E_2^{s_2} \cdots E_n^{s_n}$ is either zero, or a 
homogeneous symmetric polynomial of degree $k$, in which every partition occurring 
among the exponents of its monomial terms is less than $\alpha$ (in dictionary order). 
Because dictionary order is a total order and $p(k)$, the number of partitions of $k$, 
is finite, it follows by induction that the difference $f_k - cE_1^{s_1}E_2^{s_2} \cdots E_n^{s_n}$ is a 
polynomial in the elementary symmetric functions. Hence, $f_k$ is a polynomial in the 
elementary symmetric functions and, by induction on $k$, so is $f$. ■ 
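The key computational step in the proof is the passage from the exponents r_1 ≥ r_2 ≥ ... ≥ r_n of the leading monomial to the exponents s_1, ..., s_n of the matching product of elementary symmetric functions, namely s_i = r_i - r_{i+1} (with r_{n+1} taken to be 0). A minimal sketch, with the hypothetical helper name `e_exponents`:

```python
def e_exponents(r):
    """Given nonincreasing exponents r_1 >= ... >= r_n >= 0, return
    [s_1, ..., s_n] with s_i = r_i - r_{i+1} and s_n = r_n, so that
    r_i = s_i + s_{i+1} + ... + s_n for every i."""
    padded = list(r) + [0]
    return [padded[i] - padded[i + 1] for i in range(len(r))]

r = [4, 2, 2, 0]                 # leading monomial x1^4 x2^2 x3^2, n = 4
s = e_exponents(r)
print(s)                         # [2, 0, 2, 0], i.e. E_1^2 E_3^2
# Tail sums of s recover r, as required by the equations in the proof.
print([sum(s[i:]) for i in range(len(s))])   # [4, 2, 2, 0]
```

Here the last monomial of E_1^2 E_3^2 is x_1^2 (x_1 x_2 x_3)^2 = x_1^4 x_2^2 x_3^2, as expected.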
For the purposes of the second proof, it will be useful to modify our usual 
notation, replacing $M_t$ with $P_t$. So, for the remainder of this appendix (only), 
$P_t = M_t(x_1, x_2, \ldots, x_n) = x_1^t + x_2^t + \cdots + x_n^t$. If $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_\ell] \vdash k$, define 

$$P_\alpha = P_{\alpha_1}P_{\alpha_2} \cdots P_{\alpha_\ell}. \tag{A12}$$

Then, e.g., $P_{[3,1^2]} = P_3P_1P_1$. If $n = 3$, then 

$$P_{[3,1^2]} = (x^3 + y^3 + z^3)(x + y + z)^2.$$

A product of symmetric polynomials, $P_\alpha = P_\alpha(x_1, x_2, \ldots, x_n)$ is a symmetric 
polynomial in the variables $x_1, x_2, \ldots, x_n$. 

Before getting to the second proof, we need to introduce another ordering of the 
partitions of m. 

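A1.4 Definition. Suppose $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_\ell]$ and $\beta = [\beta_1, \beta_2, \ldots, \beta_s]$ are two 
partitions of $m$. Then $\alpha$ majorizes $\beta$, written $\alpha \succ \beta$, if $\ell \le s$, and 

$$\sum_{i=1}^{r} \alpha_i \ge \sum_{i=1}^{r} \beta_i, \quad 1 \le r \le \ell.$$

If, e.g., $\alpha = [5, 3] \vdash 8$ and $\beta = [3^2, 2] \vdash 8$ then $\alpha \succ \beta$ because $5 \ge 3$ and 
$5 + 3 \ge 3 + 3$. On the other hand, neither $\alpha = [5, 3]$ nor $\beta = [6, 1^2]$ majorizes the 
other. Unlike dictionary order, in which every pair of partitions of $m$ is comparable, 
majorization is a partial order. 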
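The majorization test is easy to express in code. The sketch below takes the condition to be the usual one: the first partition has no more parts than the second, and each of its partial sums is at least the corresponding partial sum of the second. The function name `majorizes` and the list representation of partitions are illustrative assumptions.

```python
from itertools import accumulate

def majorizes(alpha, beta):
    """True if alpha majorizes beta: alpha has at most as many parts as beta,
    and every partial sum of alpha is >= the matching partial sum of beta."""
    if len(alpha) > len(beta):
        return False
    # zip stops after len(alpha) partial sums, which is all that is required.
    return all(a >= b for a, b in zip(accumulate(alpha), accumulate(beta)))

print(majorizes([5, 3], [3, 3, 2]))   # True
print(majorizes([5, 3], [6, 1, 1]))   # False
print(majorizes([6, 1, 1], [5, 3]))   # False
```

The last two lines illustrate that [5, 3] and [6, 1^2] are incomparable, so majorization is only a partial order.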

A1.5 Lemma. Suppose $\alpha, \beta \vdash m$. If $\alpha \succ \beta$, then $\alpha \ge \beta$. 

Proof. If $\alpha \succ \beta$ and $\alpha \ne \beta$, then $\alpha$ must be larger in the first part where they 
differ. ■ 

Direct Proof of Theorem 1.9.11. If $\beta \vdash m$ then, from Theorem 1.8.15, $P_\beta = 
P_\beta(x_1, x_2, \ldots, x_n)$ is a linear combination of minimal symmetric polynomials. In 
other words, there exist constants $c_{\alpha\beta}$, $\alpha \vdash m$, such that 

$$P_\beta = \sum_{\alpha \vdash m} c_{\alpha\beta}M_\alpha, \tag{A13}$$

where $M_\alpha = M_\alpha(x_1, x_2, \ldots, x_n)$ is the minimal symmetric polynomial 
corresponding to $\alpha$. (Together with Equation (A12), this explains why it was necessary to 
replace $M_t$ with $P_t$.) 

A1.6 Lemma. In Equation (A13), the constants $c_{\alpha\beta}$ satisfy 

(i) $c_{\alpha\alpha} \ne 0$, $\alpha \vdash m$; and 
(ii) $c_{\alpha\beta} = 0$ unless $\alpha \succ \beta$. 

Lemma A1.6 all but finishes the second proof of Theorem 1.9.11. To see why, 
consider the $p(m) \times p(m)$ transition matrix $C = (c_{\alpha\beta})$ whose rows and columns are 
indexed by the partitions of $m$ arranged in dictionary order. It follows from 
Lemmas A1.5 and A1.6 that $C$ is a lower triangular matrix, none of whose diagonal 
entries is zero. In particular, $C$ is invertible. Therefore, the minimal symmetric 
polynomials $M_\alpha$, $\alpha \vdash m$, are linear combinations of the power sum products 
$P_\beta$, $\beta \vdash m$, i.e., $M_\alpha$ is a polynomial in the power sums $P_t$, $t \ge 1$. 

In view of Theorem 1.8.15, this leaves us with the technical detail of showing, 
for a fixed but arbitrary $\beta \vdash m > n$, that $P_\beta(x_1, x_2, \ldots, x_n)$ is a polynomial in 
$P_t(x_1, x_2, \ldots, x_n)$, $t \le n$. This we prove by induction on $j = m - n$. 

By Equation (A2), for any $m > n$, 

$$P_m = P_{m-1}E_1 - P_{m-2}E_2 + \cdots + (-1)^{n+1}P_{m-n}E_n. \tag{A14}$$

Earlier in this appendix we used Newton's identities to show, for each $i \ge 1$, that $E_i$ 
is a polynomial in $P_t$, $1 \le t \le n$. Setting $m = n + 1$ in Equation (A14) establishes 
the basis ($j = 1$) step of the induction. Setting $m = n + j$ finishes it. 

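Proof of Lemma A1.6. Suppose $\alpha, \beta \vdash m$. If the monomial $x_1^{\alpha_1}x_2^{\alpha_2} \cdots x_\ell^{\alpha_\ell}$ appears in 

$$P_\beta = (x_1^{\beta_1} + x_2^{\beta_1} + \cdots + x_n^{\beta_1})(x_1^{\beta_2} + x_2^{\beta_2} + \cdots + x_n^{\beta_2}) \cdots (x_1^{\beta_s} + x_2^{\beta_s} + \cdots + x_n^{\beta_s}),$$

then $\{\beta_1, \beta_2, \ldots, \beta_s\}$ can be expressed as the disjoint union $B_1 \cup B_2 \cup \cdots \cup B_\ell$ in 
such a way that $\alpha_i$ is the sum of the elements of $B_i$, $1 \le i \le \ell$. Since 
$\alpha_1 \ge \alpha_2 \ge \cdots \ge \alpha_\ell$, and since $\beta_1$ belongs to some $B_i$, it must be that $\alpha_1 \ge \beta_1$. 
Because $\{\beta_1, \beta_2\}$ belongs to the union of some pair of the $B$'s, $\alpha_1 + \alpha_2 \ge 
\beta_1 + \beta_2$, and so on. In other words, $\alpha \succ \beta$, establishing part (ii). Since 
$x_1^{\alpha_1}x_2^{\alpha_2} \cdots x_\ell^{\alpha_\ell}$ appears in $P_\alpha$, part (i) is immediate. ■ 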
There are two ways of constructing a software design; one way is to make it so simple 
that there are obviously no deficiencies, and the other way is to make it so compli- 
cated that there are no obvious deficiencies. 

— C. A. R. Hoare 

The purpose of this appendix is to address the "sorting problem" raised in 
Section 1.10, i.e., to develop and discuss algorithms to sort sets of numbers into 
numerical order and sets of words into dictionary order. 

Suppose you had a well shuffled deck of 3 x 5 cards, each with a single number 
on it. Suppose you had the job of designing an algorithm to sort the cards into non- 
decreasing numerical order. The best way to begin is probably to try to articulate 
how you would do the chore yourself, and then consider alternative approaches that 
might yield a better step-by-step process. One possibility is to scan the cards for a 
smallest number, move it up to the front (top) of the deck, scan the remaining cards 
for a smallest number, move it up to the second place, and so on. Another 
possibility is to start a new deck with some arbitrary card, choose another card 
from those that remain in the old deck and insert it in the new deck at an appropriate 
place, pick a third card from the old deck and insert it in its proper place in the new 
deck, etc. Might one of these approaches yield a better algorithm than the other? 
One way to find out would be to try them, say, on a set consisting of 1000 numbers. 

This raises the tedious prospect of having to enter 1000 numbers into a 
computer. There is an alternative. Assuming the keyword RND returns a pseudo- 
random number from the interval (0,1), a subroutine to generate N pseudorandom 
integers from the interval [0, 999] follows: 

1. Input N. 

2. For I = 1 to N. 

3. R(I) = ⌊1000 × RND⌋. 

4. Next I. 

If a program implementing this subroutine were run several times, with the same 
N, many desktop computers would return the same N integers, in the same order! 
That can be useful, e.g., when comparing different sorting algorithms. On the other 
hand, whenever it seems desirable, this default setting can be overridden by inserting 
(just once, at the beginning) a Randomize command. 
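In a modern language the same idea looks roughly as follows. This is a sketch in Python rather than the BASIC-style dialect the text assumes; `generate` is a hypothetical name, and Python's `random.random()` returns values in [0, 1), which differs immaterially from the open interval mentioned above. Passing a fixed seed reproduces the same list on every run, while omitting it plays the role of the Randomize command.

```python
import random

def generate(n, seed=None):
    """Return n pseudorandom integers from the interval [0, 999].

    With a fixed seed the same list is returned on every call, which is
    convenient when comparing sorting algorithms; with seed=None the
    generator is seeded from the system, much like a Randomize command.
    """
    rng = random.Random(seed)
    return [int(1000 * rng.random()) for _ in range(n)]

first = generate(5, seed=2003)
second = generate(5, seed=2003)
print(first == second)    # True: same seed, same numbers, same order
```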

Given N numbers to sort, let's try to implement the first approach and scan them 
for the smallest number. How, exactly, might that be done? One way is to let 
X = R(1), then compare X with R(2). If R(2) is smaller than R(1), change the value 
of X to R(2). Otherwise keep its value equal to R(1). Then compare X with R(3), 
and so on. Eventually, after N − 1 comparisons, a smallest number is identified. 
Shifting the other numbers to make room for X at the front (top) of the deck 
requires knowledge, not only of the value of the smallest number, but also of its 
location in the deck. That's asking too much from a single memory location. We 
need one location, X, to keep track of the value of the number and another, J, to 
keep track of its location. 

Imagine, in the middle of this process, having (re)arranged things so that, of the 
original N numbers, the smallest C are R(1) ≤ R(2) ≤ ··· ≤ R(C). The next step 
would be to scan for the smallest of the remaining numbers. This process might 
start like this: 

5. C = C + 1. 

6. J = C. 

7. X = R(C). 

If (the new) C = N, the task is complete, and it would be time to proceed to the 
check-out line: 

8. If C = N then go to step 20. 

20. For I = 1 to N. 
21. Write R(I). 
22. Next I. 

Otherwise, start scanning: 

9. For I = C + 1 to N. 

10. If X ≤ R(I) then go to step 13. 

11. J = I. 

12. X = R(J). 

13. Next I. 

At the completion of the loop in steps 9-13, the next smallest number will have 
been located in position J. If J is still (the new) C, it is already in its proper place 
and we can return to step 5: 

14. If J = C then go to step 5. 

Otherwise, we need to reorganize the list: 

15. For I = J to C + 1. 
16. R(I) = R(I − 1). 
17. Next I. 
18. R(C) = X. 

19. Go to step 5. 

Note, in step 15, that the value of I starts at J and works its way backward to C + 1. 

There is a way to have the computer time itself as it sorts. Leaving out the time it 
takes to generate the numbers to be sorted, this timekeeping chore can be accom- 
plished, symbolically, by adding the steps 

4.1. Start = Time. 
23. Write Time − Start. 

Finally, one might like to see the original list of unsorted numbers. This can be 
accomplished by adding step 

3.1. Write R(I) . 

Assembling these steps, in the proper order (and initializing C), we obtain the 
following: 

A2.1 (Smallest First) Sorting Algorithm 

1. Input N and set C = 0. 

2. For I = 1 to N. 

3. R(I) = ⌊1000 × RND⌋. 
3.1. Write R(I). 

4. Next I. 
4.1. Start = Time. 

5. C = C + 1. 

6. J = C. 

7. X = R(C). 

8. If C = N then go to step 20. 

9. For I = C + 1 to N. 

10. If X ≤ R(I) then go to step 13. 

11. J = I. 

12. X = R(J). 

13. Next I. 

14. If J = C then go to step 5. 

15. For I = J to C + 1. 
16. R(I) = R(I − 1). 
17. Next I. 

18. R(C) = X. 

19. Go to step 5. 

20. For I = 1 to N. 
21. Write R(I). 

22. Next I. 

23. Write Time − Start. 
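For readers who prefer structured code to numbered steps and go-to's, the sorting part of Algorithm A2.1 (steps 5 through 19) might be transcribed as below. This is a sketch under the assumption that lists are 0-indexed and that input generation, output, and timing are handled elsewhere; `smallest_first_sort` is an illustrative name.

```python
def smallest_first_sort(numbers):
    """Sort by repeatedly locating a smallest remaining value and moving it
    to the front of the unsorted part, shifting the skipped values right."""
    r = list(numbers)                 # work on a copy
    n = len(r)
    for c in range(n - 1):            # position to be filled next
        j, x = c, r[c]                # location and value of the current minimum
        for i in range(c + 1, n):     # scan the remaining numbers (steps 9-13)
            if r[i] < x:
                j, x = i, r[i]
        for i in range(j, c, -1):     # shift, working backward from j (steps 15-17)
            r[i] = r[i - 1]
        r[c] = x                      # step 18
    return r

print(smallest_first_sort([503, 87, 512, 61, 908, 170, 897, 275]))
# [61, 87, 170, 275, 503, 512, 897, 908]
```

Wrapping a call in `time.perf_counter()` readings before and after would play the role of steps 4.1 and 23.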

The other possibility we had in mind was to fashion a new deck, a card at a time, 
by insertion. If A(1) ≤ A(2) ≤ ··· ≤ A(C) are the numbers R(1), R(2), ..., R(C) 
(re)arranged into nondecreasing order, then R(C + 1) can be inserted into its 
proper place among the A's using the following subroutine: 

A2.2 Algorithm 

1. For J = 1 to C. 

2. If R(C + 1) < A(J) then go to step 6. 

3. Next J. 

4. A(C + 1) = R(C + 1). 
5. Go to step 10. 
6. For I = C to J. 
7. A(I + 1) = A(I). 

8. Next I. 

9. A(J) = R(C + 1). 

10. Return. □ 

Embedding this subroutine into a For . . . Next loop yields the following alternative 
to Algorithm A2.1: 

A2.3 (Insertion) Sorting Algorithm 

1. Input N. 

2. For I = 1 to N. 

3. R(I) = ⌊1000 × RND⌋. 
3.1. Write R(I). 

4. Next I. 

4.1. Start = Time. 

5. A(1) = R(1). 

6. For C = 1 to N − 1. 

7. Call Algorithm A2.2. 

8. Next C. 

9. For I = 1 to N. 
10. Write A(I). 

11. Next I. 

12. Write Time − Start. □ 
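The insertion idea of Algorithms A2.2 and A2.3 can be sketched compactly in Python. The version below is an illustration, not a line-by-line transcription: it scans the sorted list from the front for the first entry larger than the new value, as in step 2 of Algorithm A2.2, and lets `list.insert` perform the shifting of steps 6 through 9. The name `insertion_sort` is an assumption.

```python
def insertion_sort(numbers):
    """Build a new sorted list by inserting one value at a time."""
    a = []
    for value in numbers:
        j = 0
        # Find the first position whose entry is strictly larger than value.
        while j < len(a) and value >= a[j]:
            j += 1
        a.insert(j, value)            # shifts a[j:] one place to the right
    return a

print(insertion_sort([503, 87, 512, 61, 908, 170, 897, 275]))
# [61, 87, 170, 275, 503, 512, 897, 908]
```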

A2.4 Example. How does insertion sorting compare with smallest first sorting? 
To some extent, that will depend on programming language and machine 
architecture. Experiments on a Pentium-based PC show that where, on average, 
Algorithm A2.1 requires 10 units of time to sort 1000 numbers, Algorithm A2.3 
needs only 8 units. 

Given that insertion needs 8 (standardized) units of time to sort 1000 numbers, 
how long would you expect it to take to sort 5000 numbers? Experiments with the 
same PC show that it takes, not 40, but 196 units. It takes, not 5, but nearly 25 times 
as long! And, it is easy to see why. 

In discovering that R(C + 1) < A(J), the subroutine at the heart of Algorithm 
A2.3 needed to make J comparisons. Inserting R(C + 1) at the Jth place in the 
sequence of A's required C − J shifts, for a total of C operations. As C ranges from 
1 to N − 1, the total number of operations is 

$$1 + 2 + \cdots + (N - 1) = \tfrac{1}{2}N(N - 1).$$

Thus, the number of operations this algorithm uses to sort $n$ numbers is on the order 
of $n^2$. Algorithm A2.3 is $O(n^2)$. □ 
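The operation count predicts the observed slowdown quite well. As a quick check (a sketch; `insertion_operations` is a hypothetical helper that simply evaluates N(N - 1)/2):

```python
def insertion_operations(n):
    """Total operations 1 + 2 + ... + (n - 1) = n(n - 1)/2 from Example A2.4."""
    return n * (n - 1) // 2

ratio = insertion_operations(5000) / insertion_operations(1000)
print(round(ratio, 2))        # 25.02
print(round(8 * ratio))       # 200 predicted units, against 196 observed
```

Multiplying the size of the list by 5 multiplies the work by about 25, in line with the measured jump from 8 to 196 units.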

A2.5 Definition. Suppose $f$ and $g$ are real-valued functions defined on the set of 
positive integers. Then $f(n)$ is $O(g(n))$ if there exists a positive real number $c$ and a 
nonnegative integer $m$ such that $|f(n)| \le cg(n)$ for all $n \ge m$. 

This Big Oh notation should not be confused with $o(S)$, which, in this book, 
denotes the cardinality of the set $S$.* 

Assuming, for the sake of argument, that Example A2.4 is a convincing demon- 
stration that insertion is faster than smallest first sorting when N = 1000, might 
some third alternative be even faster? With a little fine-tuning, insertion itself 
can be speeded up considerably. 

Let's return to the point where R(C + 1) is being inserted into the ordered list of 
A's. In the worst case it will have to be compared with C numbers, namely, 
A(1),A(2), . . . ,A(C), before the correct insertion point is found. The same 
worst-case estimate applies if, instead of starting at A(l) and working up, we first 
compare R(C + 1) with A(C), then with A(C — 1), and so on, working down to 
A(l). But, if the first comparison is with a middle A, we could determine, in a single 
stroke, to which half of the list of A's the new entry belongs. If the number of 
possible insertion points can be cut in half by each comparison, the worst case would 
go from C to $\log_2(C)$ comparisons.† 

Using S for start, F for finish, M for middle, and T for temporary, here is a sub- 
routine to find the correct insertion point for R(C + 1): 

1. S = 1 and F = C. 

2. T = F − S. 

3. [If T is too small, do something else.] 
4. M = ⌊T/2⌋. 

5. If R(C + 1) < A(S + M) then F = S + M. 

6. If R(C + 1) ≥ A(S + M) then S = S + M. 
7. Go to step 2. 

* Elsewhere, little oh may be used in other ways. 
† If C = 1000, then $\log_2(C) < 10$. 

A complete algorithm based on this subroutine might look something like the 
following: 

A2.6 (Fast Insertion) Sorting Algorithm 

1. Input N. 

2. For I = 1 to N. 

3. R(I) = ⌊1000 × RND⌋. 
3.1. Write R(I). 

4. Next I. 

4.1. Start = Time. 

5. A(1) = R(1). 

6. If R(2) ≥ R(1) then A(2) = R(2). 

7. If R(2) < R(1) then A(1) = R(2) and A(2) = R(1). 

8. C = 2. 

9. If C = N then go to step 31. 
10. S = 1 and F = C. 

11. T = F − S. 

12. If T < 4 then go to step 17. 

13. M = ⌊T/2⌋. 

14. If R(C + 1) < A(S + M) then F = S + M. 

15. If R(C + 1) ≥ A(S + M) then S = S + M. 

16. Go to step 11. 

17. For I = S to F. 

18. J = I. 

19. If R(C + 1) < A(J) then go to step 25. 

20. Next I. 

21. J = F. 

22. If F < C then go to step 25. 
23. A(C + 1) = R(C + 1). 

24. Go to step 29. 
25. For I = C to J. 
26. A(I + 1) = A(I). 
27. Next I. 

28. A(J) = R(C + 1). 

29. C = C + 1. 

30. Go to step 9. 

31. For I = 1 to N. 

32. Write A(I). 

33. Next I. 

34. Write Time − Start. □ 
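The halving search at the core of fast insertion is what is usually called binary search. The sketch below is not a transcription of Algorithm A2.6: it swaps in the standard library's `bisect_right` for steps 10 through 22 and omits the switch to a linear scan when T < 4. The name `fast_insertion_sort` is illustrative.

```python
from bisect import bisect_right

def fast_insertion_sort(numbers):
    """Insertion sort in which each insertion point is found by binary search."""
    a = []
    for value in numbers:
        # bisect_right returns the first index j with a[j] > value,
        # using about log2(len(a)) comparisons.
        j = bisect_right(a, value)
        a.insert(j, value)
    return a

print(fast_insertion_sort([503, 87, 512, 61, 908, 170, 897, 275]))
# [61, 87, 170, 275, 503, 512, 897, 908]
```

Only the comparisons are reduced; each `insert` still shifts the tail of the list, so the overall work remains quadratic in the worst case, consistent with the timings reported in the next example.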

A2.7 Example. Nearly three times as long as insertion (Algorithm A2.3), fast 
insertion looks like something invented by a government bureaucrat! Nevertheless, 
in the language of Example A2.4, where smallest first requires, on average, 10 stan- 
dardized units of time to sort 1000 numbers, and insertion takes 8 units, fast inser- 
tion does the job in 4 units. In the more demanding test of sorting 5000 numbers, 
smallest first needs 243 units, insertion 196 units, and fast insertion 92. □ 



Now that we know something about sorting numbers, what about sorting sets of 
words into dictionary order? Conceptually, all that's needed is a function $f$, from the 
set of words to the positive integers, with the property that $W_1$ comes (strictly) 
before $W_2$ in dictionary order if and only if $f(W_1) < f(W_2)$. Given such a function, 
it is easy to outline a sorting algorithm: 

1. Input the words. 

2. Use $f$ to assign a number to each word. 

3. Sort the numbers. 

4. Apply $f^{-1}$ to the sorted numbers. 

5. List the resulting words. 

One way to define such a function begins by assigning the numbers 1-26 to the 
letters A-Z, respectively, and then extending the definition to arbitrary words by 
defining 

$$f(W) = \sum_{i=1}^{m} f(L_i) \times 27^{m-i}, \tag{A15}$$

where $L_1, L_2, \ldots, L_m$ are the letters in $W = L_1L_2 \cdots L_m$.* 

It is not difficult to see that $f$ is a one-to-one function and that $f(W_1) < f(W_2)$ if 
and only if $W_1$ comes before $W_2$ in dictionary order, provided $W_1$ and $W_2$ contain 
precisely the same number of letters. For words of different length, things can go 
wrong, e.g., ABC comes before D in dictionary order, but 

$$f(\text{ABC}) = 1 \times 27^2 + 2 \times 27 + 3 = 786 > 4 = f(\text{D}).$$

This difficulty can be circumvented by the introduction of an artificial letter, say @, 
defining $f(@) = 0$, and appending enough copies of @ to the end of the shorter 
word so that it becomes as long as the longer word, e.g., 

$$f(\text{D@@}) = 4 \times 27^2 + 0 \times 27 + 0 = 2916 > 786 = f(\text{ABC}).$$
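The encoding, including the padding with @, can be sketched in a few lines. In the illustration below the string `ALPHABET` and the optional `width` parameter are assumptions introduced for convenience; the position of a letter in `ALPHABET` is its numerical value, so @ is 0, A is 1, and Z is 26.

```python
ALPHABET = "@ABCDEFGHIJKLMNOPQRSTUVWXYZ"   # index = value of the letter

def f(word, width=None):
    """Value of word as a base 27 numeral, padded on the right with @
    (value 0) to the given width when one is supplied."""
    width = len(word) if width is None else width
    padded = word.ljust(width, "@")
    value = 0
    for letter in padded:
        value = 27 * value + ALPHABET.index(letter)
    return value

print(f("ABC"), f("D"), f("D", width=3))   # 786 4 2916
```

Without padding, f(D) = 4 falls below f(ABC) = 786; with padding to three letters, f(D@@) = 2916 restores dictionary order.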

To invert $f$, suppose $N = f(W)$. Dividing $N$ by 27 yields a quotient $Q_1$ and a 
remainder $R_1 = N - 27Q_1$. Because quotients and remainders are unique, it follows 
from Equation (A15) that $R_1 = f(L_m)$. Dividing $Q_1$ by 27 produces a new quotient 
$Q_2$ and a new remainder $R_2 = f(L_{m-1})$, and so on. The numerical values of the 
letters comprising $W$ are (reading from right to left) the remainders obtained when 
successive quotients are divided by 27. This reduces the problem of inverting $f$ 
to finding $f^{-1}(N)$, $0 \le N \le 26$. 

* This approach is equivalent to viewing words as base 27 numerals. 

A2.8 (Successive Division by 27) Algorithm 

1. T = N and I = 0. 

2. I = I + 1 and Q = ⌊T/27⌋. 

3. R = T − 27Q. 

4. K_I = f⁻¹(R). 

5. If Q = 0 then go to step 8. 

6. T = Q. 

7. Go to step 2. 

8. W = K_I ··· K_2 K_1. 
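Algorithm A2.8 translates almost directly into code when exact integer arithmetic is available. The following sketch uses Python's built-in `divmod`, which returns the quotient and remainder in one step; `ALPHABET` and `f_inverse` are illustrative names, with the index of a letter in `ALPHABET` serving as its numerical value.

```python
ALPHABET = "@ABCDEFGHIJKLMNOPQRSTUVWXYZ"   # index = value of the letter

def f_inverse(n):
    """Recover the word whose base 27 value is n by successive division by 27."""
    letters = []
    t = n
    while True:
        t, r = divmod(t, 27)           # new quotient and remainder (steps 2-3)
        letters.append(ALPHABET[r])    # step 4: K_I is the letter with value r
        if t == 0:                     # step 5
            break
    return "".join(reversed(letters))  # step 8: read remainders right to left

print(f_inverse(786))          # ABC
print(f_inverse(2916))         # D@@
print(f_inverse(378211451))    # ZIRCON
```

Because Python integers are exact and of unlimited size, no round-off can occur in this version.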

A2.9 Example. Zircon is a mineral whose appearance can vary from colorless to 
brown. When heated, cut, and polished, zircon yields a brilliant blue-white 
gemstone. According to our scheme for assigning numbers to words, the numerical 
value of ZIRCON is 

$$f(\text{ZIRCON}) = 26 \times 27^5 + 9 \times 27^4 + 18 \times 27^3 + 3 \times 27^2 + 15 \times 27 + 14 = 378211451.$$

When the successive-division-by-27 algorithm was executed on a (Pentium-based) 
desktop PC, it produced $f^{-1}(f(\text{ZIRCON})) = \text{ZIRCOS}$. Because ZIRCON ≠ 
ZIRCOS, there is obviously an error somewhere. 

Unfortunately, "obviously an error" is not the same as "an obvious error." In 
this case, however, the source of the error is well known. It is due to round-off. 
Employing only its default accuracy, this computer confused $f(\text{ZIRCOS}) = 
378211456$ with $f(\text{ZIRCON}) = 378211451$. □ 
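The particular numbers in this example are what one would see if the computer's default accuracy were IEEE single precision, which carries a 24-bit significand. That is an assumption; the text does not say which format was used. Under that assumption the effect can be reproduced by forcing the value through a 4-byte float with the standard `struct` module (the helper name `to_single` is hypothetical):

```python
import struct

def to_single(x):
    """Round x to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", float(x)))[0]

n = 378211451                     # f(ZIRCON)
print(int(to_single(n)))          # 378211456, which is f(ZIRCOS)
print(int(to_single(n)) % 27)     # 19, the value of the letter S
```

Between 2^28 and 2^29, single-precision values are spaced 32 apart, so 378211451 is rounded to the nearest multiple of 32, namely 378211456, and the final remainder changes from 14 (N) to 19 (S).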

In principle, an algorithm to sort an arbitrary set of words into dictionary order is 
now at hand: 

1. Input the number, N, of words to be ordered. 

2. Input the maximum word-length, M. 

3. Input N words. 

4. Attach @'s to the ends of words as needed. 

5. To each word W, assign the number f(W). 

6. Sort the f(W)'s. 

7. Apply f⁻¹ to the sorted numbers. 

8. List the resulting words (suppressing the @'s). 

In view of Example A2.9, successfully implementing this algorithm as a 
computer program may not be straightforward. There is, however, a relatively 
easy procedure that not so much solves as postpones the round-off error problem. 
Double precision is a phrase associated with extending the number of numerical 
digits carried by a computer. Using nothing more complicated than double 
precision arithmetic, our main algorithm returned accurate results on a desktop PC for 
all M ≤ 11, enough to accommodate words as long as MISSISSIPPI. For lists 
containing longer words, other programming techniques are required. 
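The cutoff at 11 letters is what one would expect if double precision means the IEEE 64-bit format, in which every integer up to 2^53 is represented exactly (an assumption about the machine, not a statement from the text). A word of M letters has value at most 27^M - 1, so the question is how large M can be before that bound exceeds 2^53. A short check, with illustrative names:

```python
EXACT_LIMIT = 2 ** 53        # integers up to here are exact in IEEE double precision

def largest_code(m):
    """Largest value f can take on a word of m letters (all letters Z)."""
    return 27 ** m - 1

longest = max(m for m in range(1, 30) if largest_code(m) <= EXACT_LIMIT)
print(longest)                               # 11
print(largest_code(11) <= EXACT_LIMIT)       # True
print(largest_code(12) <= EXACT_LIMIT)       # False
```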



EXERCISES A2 

1. An algorithm is $O(1)$ if its running time is independent of the size of the input. 
Design an algorithm to sum the first $n$ positive integers 

(a) that is $O(n^2)$. 

(b) that is $O(1)$. 

2. Let $a_i$ be a real number, $0 \le i \le r$. If $a_r \ne 0$, show that $f(n) = a_rn^r + 
a_{r-1}n^{r-1} + \cdots + a_0$ is $O(n^r)$. 

3. Show that 

(a) smallest first sorting (Algorithm A2.1) is $O(n^2)$. 

(b) any $O(n)$ algorithm is $O(n^2)$. 

4. Suppose $f(n)$ and $g(n)$ are $O(h(n))$. Show that 

(a) $f(n) + g(n)$ is $O(h(n))$. 

(b) $f(n)g(n)$ is $O(h(n)^2)$. 

5. For a variety of reasons, it is not uncommon to be given the task of merging two 
(or more) sorted lists. Write an algorithm to merge the following two lists into a 
single list (sorted in nondecreasing order): 

1,2,3,4,4,4,5,5,5,7 
2,3,3,4,5,5,6,6,8,9 

6. All three sorting algorithms in this appendix use a subroutine to generate N 
pseudorandom integers from the interval [0,999]. Here is a modification to 
generate 1000 pseudorandom integers from the interval [0,99]: 

1. For I = 1 to 1000. 

2. R(I) = ⌊100 × RND⌋. 
3. Next I. 

To the extent that RND simulates a random-number generator, each integer in 
[0,99] ought to occur with equal likelihood. If, e.g., the subroutine were run 
several times we would expect, on average, the number 99 to occur 10 times. 
Modify one of the sorting algorithms in the text (or write your own) so that it 
generates and sorts 1000 pseudorandom integers between 0 and 99 (inclusive). 
(a) Run your (modified) program 10 times (using 10 different randomizing 
"seeds") and record the number of times 99 occurs in each run. 

(b) Explain why it is helpful, in doing part (a), to sort the thousand integers 
before counting the occurrences of 99. 

7. Write an algorithm to input N, generate N pseudorandom birthdates (month 
and day, but not year; exclude February 29), and output the data sorted in 
increasing order of dates. 

8. Modify Exercise 7 so that, e.g., if "MAR 22" were to occur three times, 
instead of "MAR 22 MAR 22 MAR 22", the output for that date would be 
something like "MAR 22 (3)". 

9. Another idea for sorting $n$ numbers might be called switch sorting. 
Successively compare R(J) with R(J + 1). If R(J) > R(J + 1), switch them. 
Otherwise leave them alone. Repeat this process as many times as necessary to sort 
the numbers. 

(a) Write an algorithm to implement switch sorting. 

(b) Show that switch sorting is (also!) $O(n^2)$. 

(c) Run your program from part (a) with $n = 1000$ and compare the sorting 
time with fast insertion. 

10. Restricting its domain to the set of words that can be assembled using (only) 
the 26 uppercase letters of the English alphabet, prove that the function 
defined by Equation (A15) is one-to-one. 
Readers of this book are presumed to have been exposed to that part of elementary 
linear algebra commonly found among the lower division requirements for majors in 
the mathematical and computer sciences. The purpose of this appendix is to provide 
an informal reminder of these already familiar topics, to specify certain conventions 
of language, and to touch on one or two nonstandard topics that may be mentioned 
in the text but are not essential to understanding it. 

If $v = (a_1, a_2, \ldots, a_n)$, its transpose, $v^t$, is the $n \times 1$ column vector whose $i$th 
entry is $a_i \in K$, $1 \le i \le n$, where $K$ is the field of scalars. While the following 
cussion focuses primarily on the field K = R, of real numbers, the techniques 
extend to, or have analogs for, other fields. The applications in Chapter 6, e.g., 
involve the scalar field F = {0, 1}, where arithmetic is Boolean. 

The homogeneous system of linear equations 

$$\begin{aligned} x_1 + 2x_2 \phantom{{}+x_3} + 3x_4 + 3x_5 &= 0 \\ x_1 + 2x_2 + x_3 + 7x_4 + 4x_5 &= 0 \\ 2x_1 + 4x_2 \phantom{{}+x_3} + 6x_4 + 5x_5 &= 0 \end{aligned} \tag{A16}$$

is equivalent to the single matrix equation Ax = 0, where 0 is the 3 x 1 zero matrix, 
x is the 5 x 1 matrix with x, in its rth row, and the coefficient matrix is 



A = 



/l 2 0 3 3\ 

12 17 4 
\2 4 0 6 5/ 



(A17) 



The Gauss-Jordan elimination method for solving such systems employs elementary 
row operations to transform $A$ to Hermite normal form (also called reduced 
row echelon form). For the matrix $A$ in Equation (A17), subtracting row 1 from 
row 2, and twice row 1 from row 3, yields the row equivalent matrix 

$$
B = \begin{pmatrix} 1 & 2 & 0 & 3 & 3\\ 0 & 0 & 1 & 4 & 1\\ 0 & 0 & 0 & 0 & -1 \end{pmatrix}.
$$

Adding row 3 of $B$ to row 2, three times row 3 to row 1, and then multiplying row 3 
by $-1$, produces the Hermite normal form 

$$
C = \begin{pmatrix} 1 & 2 & 0 & 3 & 0\\ 0 & 0 & 1 & 4 & 0\\ 0 & 0 & 0 & 0 & 1 \end{pmatrix},
$$

in which the pivot entries (the leading 1's in each row of $C$) are the only nonzero 
entries in their respective columns. 

One virtue of elementary row operations is that they leave the solution set 
unchanged, i.e., $Ax = 0$, $Bx = 0$, and $Cx = 0$ all have the same solution set. In 
particular, $x$ solves Equations (A16) if and only if 

$$
\begin{aligned}
x_1 + 2x_2 + 3x_4 &= 0\\
x_3 + 4x_4 &= 0\\
x_5 &= 0,
\end{aligned}
$$

if and only if 

$$
\begin{aligned}
x_1 &= -2x_2 - 3x_4\\
x_3 &= -4x_4\\
x_5 &= 0,
\end{aligned}
\tag{A18}
$$

where the pivot variables are expressed as linear functions of the nonpivot variables. 
In other words, the pivot columns in the Hermite normal form correspond 
to dependent variables and the nonpivot columns to independent variables. In vector 
language, $v = (x_1, x_2, \ldots, x_5)$ solves Equations (A18) if and only if 

$$
\begin{aligned}
(x_1, x_2, \ldots, x_5) &= (-2x_2 - 3x_4,\; x_2,\; -4x_4,\; x_4,\; 0)\\
&= x_2(-2, 1, 0, 0, 0) + x_4(-3, 0, -4, 1, 0),
\end{aligned}
$$

i.e., the solution set of Equations (A16) is the vector space $\mathcal{L}(E)$ consisting of all 
linear combinations of the basis $E = \{(-2, 1, 0, 0, 0), (-3, 0, -4, 1, 0)\}$. This solution 
set is also known as the kernel of $A$, denoted $\ker(A)$. The nullity of $A$ is the 
dimension of its kernel. In this case, nullity(A) = 2. 
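The reduction just described can be checked mechanically. The following is a minimal sketch, assuming exact rational arithmetic via Python's `fractions` module; the helper names `hermite_normal_form` and `kernel_basis` are illustrative choices, not notation from the text.

```python
from fractions import Fraction

def hermite_normal_form(rows):
    """Return (reduced row echelon form, list of pivot column indices)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        # look for a row at or below r with a nonzero entry in column c
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        lead = m[r][c]
        m[r] = [x / lead for x in m[r]]            # make the pivot entry 1
        for i in range(len(m)):
            if i != r and m[i][c] != 0:            # clear the rest of column c
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def kernel_basis(rows):
    """One basis vector per nonpivot (independent) column."""
    h, pivots = hermite_normal_form(rows)
    n = len(rows[0])
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[free] = Fraction(1)
        for row_index, p in enumerate(pivots):
            v[p] = -h[row_index][free]             # pivot variable in terms of the free one
        basis.append(v)
    return basis

A = [[1, 2, 0, 3, 3], [1, 2, 1, 7, 4], [2, 4, 0, 6, 5]]
```

Applied to the matrix of Equation (A17), the sketch reports pivots in columns 1, 3, and 5 (indices 0, 2, 4) and a two-vector kernel basis, in agreement with rank 3 and nullity 2.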

Recall that $E = \{v_1, v_2, \ldots, v_n\}$ is a basis of the vector space $V$ if 

1. $V = \mathcal{L}(E) = \{a_1 v_1 + a_2 v_2 + \cdots + a_n v_n : a_i \in K,\ 1 \le i \le n\}$ and 

2. $E$ is linearly independent, 

i.e., $a_1 v_1 + a_2 v_2 + \cdots + a_n v_n = 0$ if and only if $a_i = 0$, $1 \le i \le n$. All bases of $V$ 
contain the same number of vectors, the dimension of $V$. The rank of $A$ is the 
dimension of its row space, i.e., the number of pivot entries in its Hermite normal 
form. For the matrix in Equation (A17), rank(A) = 3. 

Because the rank of a fixed but arbitrary $m \times n$ matrix $A$ is the number of pivot 
columns in its Hermite normal form and its nullity is the number of nonpivot 
columns, 

$$\operatorname{rank}(A) + \operatorname{nullity}(A) = n. \tag{A19}$$

Returning to Equations (A16), the nonhomogeneous counterpart of $Ax = 0$ 
is $Ax = u^t$, where $u = (a, b, c)$, say. If $x$ is one solution of this equation and $y$ 
is another then, because matrix multiplication is distributive, $A(y - x) = 0$, 
i.e., $v = y - x$ is a solution of the homogeneous equation. It follows that 
any solution to the nonhomogeneous equation is of the form $y = x + v$, 
where $v \in \mathcal{L}((-2, 1, 0, 0, 0), (-3, 0, -4, 1, 0))$. Written in the form 
$x + \mathcal{L}((-2, 1, 0, 0, 0), (-3, 0, -4, 1, 0))$, the solution set of $Ax = u^t$ is sometimes 
called a coset. (A standard decoding array [Section 6.2] is simply a list that associates 
with each syndrome $u$ a minimum-weight binary word from the corresponding 
coset.) 

If $v = (a_1, a_2, \ldots, a_n)$ and $w = (b_1, b_2, \ldots, b_n)$, their scalar (or dot) product is 

$$v \cdot w = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n. \tag{A20}$$

(In the analog for the complex field $\mathbb{C}$, $b_i$ would be replaced by its complex 
conjugate $\bar{b}_i$.) If $K = \mathbb{R}$, then 

$$v \cdot w = \|v\|\,\|w\| \cos(\theta),$$

where $\|v\| = (v \cdot v)^{1/2}$ is the magnitude of $v$, and $\theta$ is the angle between $v$ and $w$. 
In particular, $v \cdot w = 0$ if and only if $\cos(\theta) = 0$, if and only if $v$ and $w$ are 
perpendicular. 

If $K$ is the Boolean field $F = \{0, 1\}$, then $v \cdot v$ is the parity of $v$, i.e., it is 0 if an 
even number of coordinates of $v$ are ones, and 1 if an odd number of coordinates 
are ones. Regardless of the choice of $K$, $v$ and $w$ are said to be orthogonal if 
$v \cdot w = 0$. 

If $W$ is a subspace of $V$, then 

$$W^{\perp} = \{v \in V : v \cdot w = 0 \text{ for all } w \in W\}.$$

If $K = \mathbb{R}$, then $W^{\perp}$ is called the orthogonal complement of $W$. If $K = F$, it is the 
dual of the linear code $W$. In either case, if $W$ is the row space of an $n \times m$ matrix $A$, 
then $W^{\perp}$ is the kernel of $A$. 

If $A = (a_{ij})$ is an $n \times n$ matrix, its determinant is $a_{11}$ when $n = 1$. Otherwise, it is 

$$\det(A) = \sum (-1)^{i+j} a_{ij} \det(A_{ij}), \tag{A21}$$

where $A_{ij}$ is the $(n - 1)$-square submatrix of $A$ obtained by deleting its $i$th row and 
$j$th column, and the summation is over either $i$ or $j$ going from 1 to $n$. The classical 
adjoint, or adjugate, of $A$ is the matrix $A^{\dagger}$ whose $(i,j)$-entry is $(-1)^{i+j} \det(A_{ji})$. It 
follows from Equation (A21) that 

$$A A^{\dagger} = \det(A) I_n, \tag{A22}$$

where $I_n = (\delta_{ij})$ is the $n$-square identity matrix whose $(i,j)$-entry $\delta_{ij} = 1$ if $i = j$, and 
0 otherwise. It follows from Equation (A22) that $A$ is invertible if and only if 
$\det(A) \ne 0$, in which case $A^{-1} = [1/\det(A)]A^{\dagger}$. 
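As a small illustration of Equations (A21) and (A22), the sketch below expands the determinant along the first row and builds the adjugate entry by entry. The function names and the 3 x 3 test matrix are assumptions chosen for the example; the determinant of the empty matrix is taken to be 1 so that the 1 x 1 case works.

```python
def minor(a, i, j):
    """Submatrix of `a` with row i and column j deleted (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]

def det(a):
    """Determinant by expansion along the first row."""
    if not a:
        return 1                      # convention for the empty matrix
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] * det(minor(a, 0, j)) for j in range(len(a)))

def adjugate(a):
    """Matrix whose (i, j)-entry is (-1)^(i+j) det(A_ji)."""
    n = len(a)
    return [[(-1) ** (i + j) * det(minor(a, j, i)) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
product = matmul(A, adjugate(A))      # should equal det(A) times the identity
```

Cofactor expansion takes on the order of n! steps, so this is meant only for small hand-checkable matrices, not as a practical way to compute determinants.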

Let $A$ be an $n \times n$ matrix. Then a number $\lambda \in K$ is an eigenvalue of $A$ if there 
exists a nonzero column vector $v$ such that $Av = \lambda v$, in which case $v$ is an 
eigenvector of $A$ afforded by $\lambda$. Thus, $0 \ne v$ is an eigenvector of $A$ afforded by $\lambda$ 
if and only if $(\lambda I_n - A)v = 0$, if and only if $\lambda I_n - A$ is a singular matrix, if and only 
if $\det(\lambda I_n - A) = 0$. 

The characteristic polynomial of $A$ is 

$$p(x) = \det(xI_n - A) = x^n - c_1 x^{n-1} + c_2 x^{n-2} - \cdots + (-1)^n c_n. \tag{A23}$$

If $A$ is an $n \times n$ matrix over $K$, then $\lambda$ is an eigenvalue of $A$ if and only if $\lambda \in K$ and 
$p(\lambda) = 0$. The characteristic roots of $A$ are the zeros of $p(x)$ (possibly over an 
extension field of $K$). If $r_1, \ldots, r_n$ are the characteristic roots of $A$, multiplicities 
included, then 

$$c_n = \prod_{i=1}^{n} r_i = \det(A), \tag{A24}$$

and 

$$c_1 = \sum_{i=1}^{n} r_i = \sum_{i=1}^{n} a_{ii} = \operatorname{tr}(A), \tag{A25}$$

the trace of $A$. More generally, $c_t = E_t(r_1, r_2, \ldots, r_n)$, the $t$th elementary symmetric 
function of the characteristic roots. 
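The coefficients in Equation (A23) can be computed without finding the roots, by summing principal minors as in Theorem A3.5 in the list at the end of this appendix. The following is a rough sketch under that assumption; `char_poly_coefficients` is a hypothetical helper name and the 3 x 3 matrix is an arbitrary example.

```python
from itertools import combinations

def det(a):
    """Determinant by expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum(
        (-1) ** j * a[0][j] * det([row[:j] + row[j + 1:] for row in a[1:]])
        for j in range(len(a))
    )

def char_poly_coefficients(a):
    """Return [c_1, ..., c_n], where c_t is the sum of the t x t principal minors."""
    n = len(a)
    return [
        sum(det([[a[i][j] for j in idx] for i in idx])
            for idx in combinations(range(n), t))
        for t in range(1, n + 1)
    ]

A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
c = char_poly_coefficients(A)    # p(x) = x^3 - c[0] x^2 + c[1] x - c[2]
```

For this matrix the first coefficient is the trace and the last is the determinant, as Equations (A24) and (A25) predict.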

Of special interest is the case in which all of the characteristic roots belong to the 
scalar field $K$. (This will always be the case when $K = \mathbb{C}$.) The square matrix 
$A = (a_{ij})$ is symmetric if $a_{ij} = a_{ji}$ for all $i$ and $j$, i.e., if $A = A^t$. It is shown in 
advanced linear algebra courses that the characteristic roots of real symmetric 
matrices are all real, and that any such matrix is similar (over $\mathbb{R}$) to a diagonal 
matrix. A real symmetric matrix all of whose characteristic roots are nonnegative 
is said to be positive semidefinite. It turns out that $A$ is positive semidefinite 
symmetric if and only if $A = BB^t$ for some real matrix $B$. 

Suppose $V$ and $W$ are vector spaces (over the same scalar field $K$). A function 
$T : V \to W$ is linear if 

$$T(au + bv) = aT(u) + bT(v)$$

for all $a, b \in K$ and all $u, v \in V$. The connection between linear transformations and 
matrices is via the notion of an ordered basis. If $E = \{v_1, v_2, \ldots, v_n\}$ and 
$F = \{w_1, w_2, \ldots, w_m\}$ are ordered bases of $V$ and $W$, respectively, then, because 
$T(v_j) \in W$, there exist (unique) numbers $a_{ij} \in K$, $1 \le i \le m$, such that 

$$T(v_j) = \sum_{i=1}^{m} a_{ij} w_i, \qquad 1 \le j \le n. \tag{A26}$$

The matrix representation of $T$ with respect to the bases $E$ and $F$ is $[T]_E^F = (a_{ij})$. If 
$u = c_1 v_1 + c_2 v_2 + \cdots + c_n v_n$, then the coordinate representation of $u$ with respect 
to $E$ is $[u]_E = (c_1, c_2, \ldots, c_n)^t$, the $n \times 1$ column vector whose $i$th entry is $c_i$. Many 
nice things are known about such representations, e.g., 

$$[T]_E^F [u]_E = [T(u)]_F. \tag{A27}$$

We conclude this appendix with a list of useful results. 

A3.1 Theorem. If $B$ is obtained from the $n \times n$ matrix $A$ by 

(i) switching two rows, then $\det(B) = -\det(A)$; 

(ii) multiplying row $s$ by $c$, then $\det(B) = c \det(A)$; 

(iii) adding a multiple of row $s$ to row $t \ne s$, then $\det(B) = \det(A)$. 

A3.2 Theorem. The rank of an $m \times n$ matrix $A$ is the size of the largest square 
submatrix of $A$ whose determinant is nonzero. 

A3.3 Theorem. If $A$ and $B$ are $m \times n$ and $n \times k$ matrices, respectively, then 
$\operatorname{rank}(A) \ge \operatorname{rank}(AB)$. In particular, $\operatorname{rank}(A) \ge \operatorname{rank}(AA^t)$. 

A3.4 Definition. Suppose $A$ is an $n \times n$ matrix. If $f, g \in Q_{t,n}$, denote by $A[f|g]$ 
the $t \times t$ matrix whose $(i,j)$-entry is the $(f(i), g(j))$-entry of $A$. 

A3.5 Theorem. Suppose $r_1, r_2, \ldots, r_n$ are the characteristic roots of the $n \times n$ 
matrix $A$, multiplicities included. Then 

$$E_t(r_1, r_2, \ldots, r_n) = \sum_{f \in Q_{t,n}} \det(A[f|f]).$$

A3.6 (Cauchy-Binet) Theorem. Suppose $A$ and $B$ are $n \times n$ matrices. Let 
$C = AB$. Then, for all $f, h \in Q_{t,n}$, 

$$\det(C[f|h]) = \sum_{g \in Q_{t,n}} \det(A[f|g]) \det(B[g|h]).$$

Hints and Answers to Selected 
Odd-Numbered Exercises 



People are generally better persuaded by reasons they discover for themselves than by 
those which have come from others. 

— Blaise Pascal, Pensees 

CHAPTER 1 

1.1. The Fundamental Counting Principle 

1(c) 7x5x7x5 = 1225. 1(d) 12 x 5 x 12 x 5 = 3600. 
3 $2^6 = 64$. 

5(b) TOO, OTO, OOT. 

7(a) 60. 7(d) 120. 7(i) 4, 989, 600. 

9 Hint: 5!/(2!3!) = 10. 

11(a) 06101-9936 with a check digit of 5. 

11(b) 97208-9958 with a check digit of 3. 

13 Since the check digit is 2, the last six digits are I I I I I I 

15(b) 121. 15(c) 231. 15(e) 105. 15(f) 270. 

17(a) 1296. 17(b) 360. 

19 Nearly 89 hours. 

21 Hint: Some possibilities are GRITFLUBH, BLUFHGRIT, and BFGRITHLU. 
23(a) $o(A) = 3^5 = 243$. 
23(b) Hint: The answer can be expressed as a sum of six multinomial coefficients. 

1.2. Pascal's Triangle 

1(a) C(7,4)=35. 1(b) C(10,5) = 252. 1(c) C(12,4)=495. 
1(e) Hint: C(101, 99) = C(101, 2). 
5(a) Hint: Pascal's relation. 
7 Almost 70 billion. 

9(a) Hint: Add fractions, each of which involves lots of factorials. 

9(b) Hint: Consider n-letter words and break the problem into cases according 
to which letter comes last. (If r, = 1, then r, — 1 = 0 but, because 0! = 1, no 
harm is done.) 

11 Hint: $(2^n)^2 = 2^{2n}$. 

15(e) Hint: They all have length n = r + s. 

17 C(n,r)C(m,s). 

19(a) $F_7 = C(7, 0) + C(6, 1) + C(5, 2) + C(4, 3) = 1 + 6 + 10 + 4$. 

19(b) Hint: Pascal's relation. 

19(c) $F_7 = 13 + 8$. 

21(a) 252. 21(b) 120. 

23 C(30, 2) = 435; C(36, 2) = 630. 

25(c) The third (n = 2) row is 9, 18, 36, 72, 144. 

27 Hint: $63{,}000 = 2^3 \cdot 3^2 \cdot 5^3 \cdot 7$. 



1.3. Elementary Probability 

I T2 = 5- 3(a) 55- 
3(b) ^. 3(d) ^. 
5(a) i 5(b) i. 
5(c) i 5(d) 0. 
7 i 

9 5 x 5 x 5 = 125. 

11 $[4 \times C(13,5)]/C(52,5)$. 

13 Yes, with $p = \frac{1}{6}$ and $q = \frac{5}{6}$, the Chuck-a-Luck probabilities are given by 
Equation (1.5). 

15 $P = 1 - \left(\frac{5}{6}\right)^4 = 0.518$. 

17 Hint: $\log_2(100) = \ln(100)/\ln(2) > 6.6$. 

19 Hint: P(A and B) is always the same as P(B and A). 

21 Hint: Use Exercise 20(c). (This is a version of the so-called birthday 
paradox.) 

23(a) f. 23(b) |. 23(c) §. 

25 Hint: What are the chances that one or both drugs are worthless? Compute 
the "placebo" probabilities, (1) that 9 out of 10 snake-bite victims would 
survive without treatment, vs. (2) that 4 out of 4 would survive without 
treatment. 



1.4. Error-Correcting Codes 

1 $2^8 = 256$. 

3(a) (n,M,d) = (3,4,2). 3(b) (3,8,1). 3(d) (5,10,2). 

5(a) The ASCII code for S is 83. 

5(b) Hint: $83_{\text{ten}} = 01010011_{\text{two}}$. 

5(c) 76 is the ASCII code for L. 

5(d) Hint: $01010101_{\text{two}} = 85_{\text{ten}}$. 

5(e) Hint: $11111011_{\text{two}} = 251_{\text{ten}}$. 5(f) M-A-T-H. 

7(a) Hint: $d(b, c) = 1$ if and only if $c$ differs from $b$ in a single bit. 

7(b) $\mathcal{C} = \{11110000, 00001111\}$ is a constant weight (8,2,8) code. 

7(c) Hint: Maximize f(r) = C(8, r). 

9(a) No, n>2d. 9(b) Yes, 2[|J = 6. 

11(a) Hint: If $i \ne j$, then $d(c_i, c_j) \ge d$. 

11(b) Hint: Show that the "contribution" of the $k$th column of $A$ to $D$ is 
$2z_k(M - z_k)$. 

11(d) Hint: Use parts (a)-(c) to show that $\frac{1}{2}nM^2 \ge M(M - 1)d$. 

11(f) Hint: Show that $M \le n/(2d - n) = [2d/(2d - n)] - 1$ and observe that 
$\lfloor [2d/(2d - n)] - 1 \rfloor \le 2\lfloor d/(2d - n) \rfloor$. 

13 Hints: Use Exercise 12(b) to show that $M(n, 2r - 1) \le M(n + 1, 2r)$. To 
prove the reverse inequality, let $M = M(n + 1, 2r)$ and suppose $\mathcal{C}$ is an $(n + 1, M, 2r)$ 
code. Choose $b, c \in \mathcal{C}$ so that $d(b, c) = 2r$. If $b$ and $c$ differ in the $i$th 
bit, consider the code obtained from $\mathcal{C}$ by deleting the $i$th bit from every 
codeword. 

15 (n,M,d) = (15,2048,3). 

17 Hint: Exercise 9. 

19 Hint: $1024^{1/3} \times \sqrt{2} = 14.25$; $6(1 + \sqrt{2}) = 14.49$. 

21(a) Hint: The vocabulary of any (n,M,d) code can be divided into two 
subsets, those words that begin with 0 and those that begin with 1 . 

21(b) Hint: Use part (a) and the Plotkin bound from Exercise 9. 

23 Hint: Why is it enough to show that $C(7, 0) + C(7, 1) = 2^3$? 

25 Hint: Why is it enough to show that $N(23, 3) = 2^{11}$? 

27(a) Hint: Consider the eight codewords of ,#3 with first bit equal to 0. 

27(b) Hint: Part (a) and Exercise 12(b). 

29(a) Hint: The probabilities follow a binomial distribution. 

29(b) Approximately 0.000194. 



1.5. Combinatorial Identities 

1(a) Hint: $2 + 4 + 6 + \cdots + 2n = 2(1 + 2 + 3 + \cdots + n)$. 

3(b) Hint: Gauss. 

5(a) Hint: Theorem 1.5.1. 5(b) Hint: Symmetry. 



9(a) $A_5 = \begin{pmatrix} 1 & 1 & 1 & 1 & 1\\ 0 & 2 & 6 & 14 & 30\\ 0 & 0 & 6 & 36 & 150\\ 0 & 0 & 0 & 24 & 240\\ 0 & 0 & 0 & 0 & 120 \end{pmatrix}$ 



11 Hint: The alternating-sign theorem. [See Exercise 25(h) for a generalization 
of this important result.] 

13 Hint: Chu's theorem. 

15 Hint: Imagine a bowl containing m apples and n oranges. In how many ways 
can r pieces of fruit be chosen from the bowl? 



17(a) 3744. 17(b) 624. 

19(a) Hint: $(n + 1)C(n, r - 1)/r = C(n + 1, r)$. 

19(b) Hint: Make a change of variable in part (a). 

19(c) Hint: Induction using Pascal's relation and parts (a) and (b). 

19(d) Hint: Part (b). 

21(b) Hint: To be consistent, the expressions must differ by n A . 

21(c) $g(5) = \frac{1}{12}(2n^6 - 6n^5 + 5n^4 - n^2)$. 
23 Hint: Exercise 21. 



25(b) C m = 

$$\begin{pmatrix} 1 & 0 & 0 & 0 & 0\\ 3 & 1 & 0 & 0 & 0\\ 6 & 4 & 1 & 0 & 0\\ 10 & 10 & 5 & 1 & 0\\ 15 & 20 & 15 & 6 & 1 \end{pmatrix}$$



25(0 C f2 >,= 

$$\begin{pmatrix} 1 & 0 & 0 & 0 & 0\\ -3 & 1 & 0 & 0 & 0\\ 6 & -4 & 1 & 0 & 0\\ -10 & 10 & -5 & 1 & 0\\ 15 & -20 & 15 & -6 & 1 \end{pmatrix}$$



25(h) Hint: This result generalizes Exercise 11. 
27 Hint: Exercise 15. 



1.6. Four Ways to Choose 

1(a) P(5,3) = 60. 1(b) C(5,3) = 10. 

1(d) P(5,2) = 20. 1(g) 7! = 5040. 

3(a) C(4 + 7-l,4) = 210. 3(b) P(7,4) = 840. 

3(c) C(7,4) = 35. 3(d) 7 4 = 2401. 

5(a) 10,000. 5(b) 715. 

5(c) 210. 5(d) 5040. 

9(a) 2925. 9(c) 286. 

9(d) 816. 

11 Hint: Not all compositions of 6 have three parts. 
13 Hint: Chu's theorem. 
15(a) Hint: Induction on n + k. 

15(b) Hint: Use part (a). 

15(c) Hint: Use parts (a) and (b). 

17 Hint: If $F_k > n \ge F_{k-1}$, then $0 \le n - F_{k-1} < F_{k-2}$. 

19 Hint: Exercise 19, Section 1.2. 

21(a) C(5, 3) x 2! = 20. 21(b) 5 x C(4, 2) x 1 = 30. 

23(a) Hint: Of the $100^3$ possible outcomes allowed under unlimited replacement, 
how many are now precluded? 

23(b) 171,600. 23(c) 99,960,300. 23(d) 4,411,275. 

27(a) 286. 27(b) 1,048,576. 

29 Hint: Induction may be easiest; a longer but perhaps more informative proof 
can be based on Exercise 19, Section 1.2. 

1.7. The Binomial and Multinomial Theorems 

1(a) C(5,0) = 1. 1(b) C(7,2) = C(7,5) = 21. 

1(d) $2^2 \times C(7, 2) = 84$. 1(e) $2^5 \times C(7, 2) = 672$. 

1(f) $(-1)^5 \times C(9,4) = -126$. 1(h) $2^5 \times C(4,5) = 0$. 
3(b) Hint: $M_{[5]}(1, 1, 1) = 3$, but $M_{[4,1]}(1, 1, 1) = 6$. 
5(e) Hint: Set a = b = c = d = e = 1 in Example 1.7.8. 
7(a) 3. 7(b) 3 x 9 = 27. 

7(c) Hint: Thirty of the 66 terms were accounted for in parts (a) and (b). 
9 Hint: Consider (w — x + y — zf. 
11(b) Hint: rf = (1 + 1 + • • • + If. 

13(a) $M_{[6,4]}(x, y, z) = x^6y^4 + x^6z^4 + x^4y^6 + x^4z^6 + y^6z^4 + y^4z^6$. 
13(b) $M_{[5,5]}(x, y, z) = x^5y^5 + x^5z^5 + y^5z^5$. 

15 The C(10 + 3 — 1, 10) = 66 monomials are grouped into 14 minimal 
symmetric polynomials. 

19 Hint: Exercise 18. 

21(a) Hint: Consider the telescoping series X^=/[(./ + l)^' ~ f +l \- 

23(a) Hint: Exercise 22. 

23(b) Hint: Section 1.2, Exercise 10(a), p. 17. 

1.8. Partitions 

1(a) $[6], [5,1], [4,2], [3^2], [4,1^2], [3,2,1], [2^3], [3,1^3], [2^2,1^2], [2,1^4]$, and $[1^6]$. 
1(b) $[7], [6,1], [5,2], [4,3], [5,1^2], [4,2,1], [3^2,1]$, and $[3,2^2]$. 
1(c) $[1^7], [2,1^5], [2^2,1^3], [2^3,1], [3,1^4], [3,2,1^2], [3,2^2]$, and $[3^2,1]$. 
3 Hint: p(15) = 176. 
5(c) $30^2/12 = 75$. 

7 $[7,1^3], [6,2,1^2], [5,3,1^2], [5,2^2,1], [4^2,1^2], [4,3,2,1]$, and $[3^3,1]$. 

9 Hint: Let $k = \lfloor \sqrt{n} \rfloor$. If $S$ is a fixed but arbitrary subset of $\{1, 2, \ldots, k\}$, denote 
the sum of its elements by $\sum(S)$. Let $\pi(S)$ be the $(o(S) + 1)$-part partition of $n$, 
whose largest part is $\pi_1 = n - \sum(S) \ge k$ (when $k \ge 3$), and whose remaining 
parts (if $S \ne \emptyset$) are the elements of $S$. Show that $S \to \pi(S)$ is a one-to-one 
function. 

11(a) The three odd-part partitions of 5 are $[5]$, $[3,1^2]$, and $[1^5]$; the three 
partitions of 5 having distinct parts are [5], [4,1], and [3,2]. 

11(b) From the answer to Exercise 1(a), the four odd-part partitions of 6 are 
$[5,1]$, $[3^2]$, $[3,1^3]$, and $[1^6]$; the four partitions having distinct parts are 
[6], [5,1], [4,2], and [3,2,1]. 

13(a) Hint: Theorem 1.8.7. 

13(b) Hint: Let $\pi \vdash n$. If $\ell(\pi) \le m$, consider the partition of $n + m$ whose 
Ferrers diagram is obtained from $F(\pi)$ by adjoining a new first column 
containing $m$ boxes. 

15(a) Because 6 + 4 = 4 + 3 + 2 + 1, both [6,4] and [4,3,2,1] are partitions of 
(the same n =) 10. With that (subtle!) preliminary calculation out of the 
way, it remains to observe that $6 \ge 4$ and $6 + 4 = 10 \ge 7 = 4 + 3$. 

15(d) Hint: First show that $\alpha$ majorizes $\beta$ if and only if $F(\beta)$ can be obtained 
from $F(\alpha)$ by moving boxes down, i.e., to higher numbered rows. 

17 Hint: Let $\pi$ be a self-conjugate partition of $n$. Suppose $F(\pi)$ has $k$ boxes on 
its main diagonal. Consider the $k$-part partition of $n$ whose $i$th part is equal to 
the number of boxes in row and column $i$ of $F(\pi)$. 

19 $p_5(10) = 7$. 

21(a) $\binom{10}{8,1,1} = 90$. 21(c) $\binom{10}{3,3,2,2} = 25{,}200$. 

23(a) p(x,y,z) = 5M [2] (x,y,z) - M [V] {x,y,z). 
23(b) p(x,y,z) = 2M m (x,y,z) - 3M [2] {x,y,z) + 4M [1 , ] (x,y,z). 
25(a) $H_2(x, y) = x^2 + y^2 + xy$. 
25(b) $H_3(x, y) = x^3 + y^3 + x^2y + xy^2$. 

25(c) $H_2(a, b, c) = a^2 + b^2 + c^2 + ab + ac + bc$. 

25(d) $H_3(a, b, c) = a^3 + b^3 + c^3 + a^2b + a^2c + ab^2 + ac^2 + b^2c + bc^2 + abc$. 

27(b) Hint: These partitions may sum to anything from n = 1 to n = rs. 

1.9. Elementary Symmetric Functions 

1 Hint: $(x + 1)^2$ is a factor of $f(x)$. 
3(b) $E_1 = -5$, $E_2 = 6$, $E_3 = -2$, and $E_4 = 1$. 
3(c) $E_1 = -5$, $E_2 = -6$, $E_3 = -2$, and $E_4 = -1$. 
3(d) $E_1 = -5$, $E_2 = -6$, $E_3 = -2$, and $E_4 = -1$. 
3(f) $E_1 = -1$, $E_4 = -2$, and $E_2 = E_3 = E_5 = 0$. 
5(a) Hint: Row 4 of the elementary triangle. 
7 Hint: For $f(x)$ to have degree $n$, $b_0 \ne 0$. 
9(c) $M_3 = E_1^3 - 3E_1E_2 + 3E_3$. 

11(a) $x^3y + xy^3 = \frac{1}{2}(M_1^2 M_2 - M_2^2)$. 11(b) $x^3y + xy^3 = E_1^2 E_2 - 2E_2^2$. 

13 Hint: If $p(x) = (x - a_1)(x - a_2) \cdots (x - a_n)$, then 
$(-1)^n p(1) = (a_1 - 1)(a_2 - 1) \cdots (a_n - 1)$. 

15 Hint: Exercise 14. 

17(a) Hint: $x^{(m)} = x(x - 1) \cdots (x - [m - 1]) = x(x - 1) \cdots (x - m + 1)$. 
17(b) Hint: Induction using Pascal's relation. 
19(a) 

$\alpha$           $[6,1^2]$   $[5,2,1]$   $[4,3,1]$   $[4,2^2]$   $[3^2,2]$ 
$H_2(\alpha)$      51          47          45          44          43 

23(a) Hint: Induction and Pascal's relation. 

23(b) Hint: Newton's identities together with Equations (1.35) and (1.36). 

25 Hint: If $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the characteristic roots of $A$, then 
$c_t = E_t(\lambda_1, \lambda_2, \ldots, \lambda_n)$ and $\operatorname{tr}(A^t) = M_t(\lambda_1, \lambda_2, \ldots, \lambda_n)$. 



1.10. Combinatorial Algorithms 

1   1. For I = 1 to 100. 
    2.   Write I. 
    3. Next I. 

3 Because $r_1!/r_1!$ is not computed, the r's should be arranged so that $r_1$ is the 
largest. 

5(a) Hint: Example 1.10.8. 

7(a) Hint: Example 1.10.9. 


11(a) 1. Input $x_1, x_2, \ldots, x_6$. 
      2. $E_3 = 0$. 
      3. For I = 1 to 4. 
      4.   For J = I + 1 to 5. 
      5.     For K = J + 1 to 6. 
      6.       $E_3 = E_3 + x_I x_J x_K$. 
      7.     Next K. 
      8.   Next J. 
      9. Next I. 
     10. Return $E_3$. 

      1. Input $x_1, x_2, \ldots, x_6$. 
      2. P = 1 and $E_5 = 0$. 
      3. For I = 1 to 6. 
      4.   P = P × $x_I$. 
      5. Next I. 
      6. For I = 1 to 6. 
      7.   $E_5 = E_5 + P/x_I$. 
      8. Next I. 
      9. Return $E_5$. 
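The nested-loop pattern in the first algorithm above can be compared with a direct sum over all 3-element subsets. The following Python sketch does both; the names `e3_triple_loop` and `elementary_symmetric` are assumptions made for illustration, and the test values 1 through 6 are arbitrary.

```python
from itertools import combinations
from math import prod

def e3_triple_loop(xs):
    """Third elementary symmetric function via three nested loops, I < J < K."""
    n = len(xs)
    total = 0
    for i in range(n - 2):
        for j in range(i + 1, n - 1):
            for k in range(j + 1, n):
                total += xs[i] * xs[j] * xs[k]
    return total

def elementary_symmetric(xs, t):
    """Sum of the products of all t-element subsets of xs."""
    return sum(prod(subset) for subset in combinations(xs, t))

xs = [1, 2, 3, 4, 5, 6]
```

Unlike the second algorithm above, which divides the full product by each $x_I$, the subset-sum version also works when some of the inputs are zero.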


15(a) Hint: Count in base 2. 

15(b) Hint: Consider the rearrangements of 00001111. 

17(a) Interpreting "fifth bit" to mean the fifth bit from the right, the list is 0000, 
0001, 0011, 0010, 0110, 0111, 0101, 0100, 1100, 1101, 1111, 1110, 1010, 
1011, 1001, 1000. 
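The sixteen words listed in 17(a) coincide with the standard reflected Gray code on four bits. A minimal recursive sketch, assuming that construction (the function name `gray_code` is illustrative), is:

```python
def gray_code(n):
    """Reflected Gray code: list of n-bit words, consecutive words differ in one bit."""
    if n == 0:
        return [""]
    previous = gray_code(n - 1)
    # prefix the shorter list with 0, then its reversal with 1
    return ["0" + w for w in previous] + ["1" + w for w in reversed(previous)]

words = gray_code(4)
```

The reflection step is what makes the first and last words of each half differ only in the leading bit, which is the observation used in parts (b) and (c).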

17(b) Hint: The two words differ (only) in the leading bit. 

17(c) Hint: Induction together with your observation in part (b). 

17(e) Hint: If $X \subseteq \{1, 2, \ldots, n\}$, let $w(X) = x_1x_2 \cdots x_n$ be the binary word 
defined by $x_i = 1$ if and only if $i \in X$. 

19(a) 1. For I = 1 to 100. 
      2.   X = RND. 
      3.   If X < 1/2 then write "H". 
      4.   If X ≥ 1/2 then write "T". 
      5. Next I. 

19(b) About 50. 

19(d) 1. H = 0 and T = 0. 
      2. For I = 1 to 100. 
      3.   X = RND. 
      4.   If X < 1/2 then write "H" and set H = H + 1. 
      5.   If X ≥ 1/2 then write "T" and set T = T + 1. 
      6. Write H "heads and" T "tails". 

19(e) Add a new line to the solution in part (d): 

      7. Write "Empirical P(H) =" H/100. 

21(a) Hint: $= C(12, 6)/2^{12}$. 

21(b)  1. C = 0. 
       2. For I = 1 to 100. 
       3.   H = 0 and T = 0. 
       4.   For J = 1 to 12. 
       5.     X = RND. 
       6.     If X < 1/2 then H = H + 1. 
       7.     If X ≥ 1/2 then T = T + 1. 
       8.   Next J. 
       9.   If H = T then C = C + 1. 
      10. Next I. 
      11. Write "Empirical P(6&6) =" C/100. 
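The simulation in 21(b) can be set beside the exact value from 21(a). The sketch below is one possible Python rendering; the parameter names `trials`, `flips`, and `seed` are assumptions added so that runs can be repeated, and Python's `random.Random` stands in for RND.

```python
import random
from math import comb

def exact_probability(flips=12):
    """Probability of exactly flips/2 heads in `flips` tosses of a fair coin."""
    return comb(flips, flips // 2) / 2 ** flips

def empirical_probability(trials=100, flips=12, seed=None):
    """Fraction of trials in which heads and tails come out equal."""
    rng = random.Random(seed)
    count = 0
    for _ in range(trials):
        heads = sum(1 for _ in range(flips) if rng.random() < 0.5)
        if heads == flips - heads:
            count += 1
    return count / trials
```

With only 100 trials, as in the pseudocode, the empirical value can easily differ from the exact one by several percentage points; increasing `trials` narrows the gap.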

23  1. For I = 1 to 100. 
    2.   Write $1 + \lfloor 6 \times \text{RND} \rfloor$. 
    3. Next I. 

25 Hint: Begin by changing 6 to 12 in the previous solution. 



CHAPTER 2 

2.1. Stirling Numbers of the Second Kind 

1(a) $f(3) = 5$; $f^{-1}(4) = \{1\}$. 

1(b) $f(3) = 5$; $f^{-1}(4) = \{2, 4\}$. 

1(e) $f(3)$ doesn't exist; $f^{-1}(4) = \emptyset$. 

1(f) $f(3) = 4$; $f^{-1}(4) = \{1, 2, 3, 4, 5\}$. 

3(a) (1,2), (2,1), (1,3), (3,1), (2,3), (3,2). 

3(b) Hint: There are six of them. 3(c) There are none. 

3(d) Every function in $Q_{m,n}$ is one-to-one. 

5 
n          1     2      3      4      5      6     7    8   9 
S(8, n)    1   127    966   1701   1050    266    28    1 
S(9, n)    1   255   3025   7770   6951   2646   462   36   1 



7 Hint: Because n is "square free", d and g cannot be equal. Mimic Example 
2.1.21, but compare with Exercise 28, Section 1.2. 

9(a) $G_{2,3} = \{(1,1), (1,2), (1,3), (2,2), (2,3), (3,3)\}$. 

9(b) $G_{3,3} = \{(1,1,1), (1,1,2), (1,1,3), (1,2,2), (1,2,3), (1,3,3), (2,2,2), 
(2,2,3), (2,3,3), (3,3,3)\}$. 

11(a) Hint: Generalize your solution to Exercise 10(d). 

11(c) Hint: Part (b). 

11(d) Hint: Part (c) and Exercise 10(a). 

13 Hint: In any $(n + 1)$-part partition of $\{1, 2, \ldots, m, m + 1\}$, the number $m + 1$ 
will belong to a block of size $t + 1$, where $0 \le t \le m - n$. There are $C(m, t)$ 
ways to choose the companions of $m + 1$ and $S(m - t, n)$ ways to partition the 
remaining $m - t$ numbers among the remaining $n$ blocks. 

15 (2, 0, 2), (2, 0, 5), (2, 1, 2), (2, 1, 3), (4, 0, 5), (4, 1, 5), (7, 0, 7), (8, 0, 5), 
(8, 1, 8). 

17 Hint: The sum of your answers to parts (a) and (b) should be 
$S(5, 1) + \cdots + S(5, 5)$, the (total) number of partitions of 5. 

19 Hint: Exercise 18(d). 

21 $n[C(n - 1, 0) + C(n - 1, 1) + \cdots + C(n - 1, m - 2)]$. 

25  1. For M = 1 to 12. 
    2.   S(M, 1) = 1. 
    3.   S(M, M) = 1. 
    4. Next M. 
    5. For M = 3 to 12. 
    6.   For N = 2 to M - 1. 
    7.     S(M, N) = S(M - 1, N - 1) + N × S(M - 1, N). 
    8.   Next N. 
    9. Next M. 
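The table-filling algorithm above translates almost line for line into Python. In this sketch the table is stored in a dictionary keyed by the pair (M, N); that storage choice and the name `stirling2_table` are assumptions made for illustration.

```python
def stirling2_table(max_m=12):
    """Stirling numbers of the second kind S(m, n) for 1 <= n <= m <= max_m."""
    S = {}
    for m in range(1, max_m + 1):
        S[m, 1] = 1                   # one block
        S[m, m] = 1                   # all blocks are singletons
    for m in range(3, max_m + 1):
        for n in range(2, m):
            S[m, n] = S[m - 1, n - 1] + n * S[m - 1, n]
    return S

S = stirling2_table()
row8 = [S[8, n] for n in range(1, 9)]
row9 = [S[9, n] for n in range(1, 10)]
```

The rows for m = 8 and m = 9 can be compared with the values tabulated in the answer to Exercise 5 of this section.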

2.2. Bells, Balls, and Urns 
1(a) Hint: 

$$
\begin{aligned}
x^{(5)} &= x(x - 1)(x - 2)(x - 3)(x - 4)\\
&= x(x^2 - 3x + 2)(x^2 - 7x + 12)\\
&= x(x^4 - [7 + 3]x^3 + [12 + 3 \times 7 + 2]x^2 - [3 \times 12 + 2 \times 7]x + 24).
\end{aligned}
$$

1(b) Hint: 

$$
\begin{aligned}
\sum_{r=1}^{5} S(5, r)x^{(r)} &= x^{(5)} + 10x^{(4)} + 25x^{(3)} + 15x^{(2)} + x\\
&= [x^5 - 10x^4 + 35x^3 - 50x^2 + 24x] + \cdots + x.
\end{aligned}
$$

1(c) When m = 5 and r = 3, $3!S(5, 3) = 6 \times 25 = 150 = 3 - 96 + 243 = 
C(3, 1) \times 1^5 - C(3, 2) \times 2^5 + C(3, 3) \times 3^5$. 

1(d) When m = 5 and n = 3, $3!S(5, 3) = 3 \times 20 + 3 \times 30 = 3\binom{5}{3,1,1} + 3\binom{5}{2,2,1}$. 

r!S(5, r)   1   30   150   240   120 

7 Hint: $B_7 = 877$. 

11(a) $2!S(6, 2) = 2 \times 31 = 62 = 12 + 30 + 20 = 2\binom{6}{5,1} + 2\binom{6}{4,2} + \binom{6}{3,3}$. 

13 Hint: Because the falcons are identical, all that matters is the number of 
falcons that each brother receives. 

15 Hint: Explain the connection with n-part compositions of m. 

17 $p_3(5) = 2$. 

21(a) 10. 21(b) S(6, 4) = 65. 

21(c) 4!S(6, 4) = 1560. 21(d) $p_4(6) = 2$. 

23 $p_4(10) = 9$. 

25(a) $S(9, 1) + S(9, 2) + \cdots + S(9, 5) = 1 + 255 + \cdots + 6951 = 18{,}002$. 

25(b) $p_1(9) + p_2(9) + \cdots + p_5(9) = 1 + 4 + \cdots + 5 = 23$. 

25(c) C(5 + 9 - 1, 9) = 715. 25(d) $5^9 = 1{,}953{,}125$. 

27(c) Hint: Parts (a) and (b). 

29 Multinomial coefficient ( '" ) . 

31 Hint: In Exercise 30(b), $203 = B_6$. 

33 Hint: Set r = m and t = n in Stirling's identity to obtain 

$$m! = \sum_{n=1}^{m} (-1)^{m+n} C(m, n)\, n^m.$$

If p is an odd prime, replace m with (the even integer) p - 1 and use 
$n^{p-1} \equiv 1 \pmod{p}$. 

35(a) Hint: The students are labeled. 
35(b) $C(10, 4) \times C(6, 3) = \binom{10}{4,3,3} = 4200$. 

2.3. The Principle of Inclusion and Exclusion 

1 The nine derangements are (2,1,4,3), (2,3,4,1), (2,4,1,3), (3,1,4,2), (3,4,1,2), 
(3,4,2,1), (4,1,2,3), (4,3,1,2), and (4,3,2,1). 

3(a) C(6,2)D(4) = 135. 3(b) 40. 

3(d) No permutation in $S_n$ has exactly $n - 1$ fixed points. 

5 Hint: Find a derangement that is its own inverse. 

7 C(15, 5)D(10) = 4,008,887,883; 15!/(5!e) = 4,008,887,640. 
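The exact count in the answer above can be reproduced with a short program. This sketch assumes the two-term recurrence D(n) = (n - 1)[D(n - 1) + D(n - 2)], which is the relation the hint to Exercise 13(a) points toward; the function name `derangements` is illustrative.

```python
from math import comb

def derangements(n):
    """Number of permutations of {1, ..., n} with no fixed points."""
    if n == 0:
        return 1
    d_prev, d = 1, 0                  # D(0) = 1 and D(1) = 0
    for k in range(2, n + 1):
        d_prev, d = d, (k - 1) * (d + d_prev)
    return d

exact = comb(15, 5) * derangements(10)
```

Only the exact integer product is computed here; the approximation 15!/(5!e) depends on how e is rounded and is not re-derived.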

9 A total of 31 students have taken trigonometry. 

11 Hint: If $A_k = \{p \in S_8 : p(2k) = 2k\}$, $1 \le k \le 4$, then $g \in S_8$ deranges the 
even integers if and only if $g \notin A_1 \cup A_2 \cup A_3 \cup A_4$. Use the principle of 
inclusion and exclusion. 

13(a) Hint: If $p \in S_n$ is a derangement, then $p(n) = k \ne n$. Consider the two 
cases $p(k) = n$ and $p(k) \ne n$. 

13(b) Hint: Use part (a) together with an induction hypothesis of the form 
$(n - 1)D(n - 2) = D(n - 1) + (-1)^n$. 

13(c) Hint: part (b). 

15(a) Hint: Choose 30 times from {A, B, C,D}, with replacement, where order 
doesn't matter. 

15(b) Hint: Example 1.6.14. 15(c) 1540. 

15(d) Hint: Let $A_1$ be the set of nonnegative integer solutions to 
$a + b + c + d = 30$ in which $a \ge 11$, $A_2$ be those solutions in which 
$b \ge 11$, and so on. Use PIE. 

17(a) There are three rearrangements of the partition $[5^2, 2]$, six of $[5, 4, 3]$, and 
only one of $[4^3]$. 

19 Hint: Mimic the approach of Exercise 18. 

21(d) Hint: First compute $\varphi(p^k)$, where $p$ is a prime and $k$ is a positive integer. 
Then consider the case in which $m = p^k$, where $p$ is a prime that is not a 
factor of $n$. 

25(b) Consider $p = (i_1, i_2, \ldots, i_{n+1}) \in S_{n+1}$, where $p(t) = i_t = n + 1$. If $p$ has $k$ 
inversions, how many inversions does 

$g = (p(1), \ldots, p(t - 1), p(t + 1), \ldots, p(n + 1)) \in S_n$ 

have? 

27(a) Hint: Why is this the same as asking for the probability that a permutation, 
randomly chosen from $S_{15}$, is a derangement? 

29 Hint: How is this different from listing the m! different "words" that can be 
produced by rearranging the "letters" of 12 ... m? 



2.4. Disjoint Cycles 

1(a) (126) (345) (7). 1(b) (17) (26) (35) (4). 
1(c) (17) (26) (345). 1(f) (13579) (24) (68). 
3(a) (2,3,1,5,4,7,6). 3(b) (3,4,5,6,1,2). 
3(c) (3,4,1,6,5,2). 3(d) (2,1,3,4,5). 

5 Hint: $p(g(x)) = x$ if and only if $x$ follows $y$ in $C_p(x)$ whenever $y$ follows $x$ in 
$C_g(x)$. 

7 S 4 = {(1)(2)(3)(4), (12)(3)(4), (13) (2) (4), (14)(2)(3), (1)(23)(4), (1)(24)(3), 
(1)(2)(34), (12)(34), (13)(24), (14) (23), (1)(234), (1)(243), (134)(2), 
(143) (2), (124)(3), (142) (3), (123)(4), (132)(4), (1234), (1243), 
(1324), (1342), (1423), (1432)}. 



Type     $[5]$   $[4,1]$   $[3,2]$   $[3,1^2]$   $[2^2,1]$   $[2,1^3]$   $[1^5]$ 
Number   24      30        20        20          15          10          1 

11(a) Hint: $[P(12, 3)/3][P(9, 3)/3][P(6, 3)/3][P(3, 3)/3]/4! = 12!/[3^4\, 4!]$. 
11(b) Hint: $[P(12, 4)/4][P(8, 4)/4][P(4, 4)/4]/3! = 12!/[4^3\, 3!]$. 
13 A total of C(m, 2) transpositions belong to $S_m$. 
15 s(7, 2) = 1764. 

17 Hint: Interchange m and n in the proof of Theorem 1.8.7 on p. 78. 

19(a) Hint: If the cycle type of $p$ is $[m^{k_m}, \ldots, 3^{k_3}, 2^{k_2}, 1^{k_1}]$, then 
$c_t(p) = k_t$, $1 \le t \le m$. 

19(b) Hint: See the hints to Exercises 11(a)-(b). 



2.5. Stirling Numbers of the First Kind 

1 (12)(34), (13)(24), (14)(23), (1)(234), (1)(243), (134)(2), (143)(2), (124)(3), 
(142)(3), (123)(4), and (132)(4). 

         n = 2      3         4        5        6     ... 
m = 8    13,068    13,132    6,769    1,960     322   ... 
m = 9   109,584   118,124   67,284   22,449   4,536   ... 



7 Hint: If $p \in S_m$ has $m - 1$ cycles in its disjoint cycle factorization, how many 
fixed points does $p$ have? 

9(a) Hint: Example 1.9.5. 

11(b) Hint: Set x = m = n in Equations (2.33)-(2.34). 

15 Hint: $x^{(m+1)} = x \cdot (x - 1)^{(m)}$. 

17 Hint: Compare with Exercise 11, Section 1.5. 

19 Hint: Bell numbers are sums of Stirling numbers of the second kind. 

21(b) Hint: The first odd composite integer is 9. 

27  1. s(1, 1) = 1, s(2, 1) = 1, and s(2, 2) = 1. 
    2. For m = 3 to 10. 
    3.   s(m, 1) = (m - 1) × s(m - 1, 1) and s(m, m) = 1. 
    4.   For n = 2 to m - 1. 
    5.     s(m, n) = s(m - 1, n - 1) + (m - 1) × s(m - 1, n). 
    6.   Next n. 
    7. Next m. 
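A direct Python rendering of the algorithm above is sketched here, assuming the standard recurrence s(m, n) = s(m - 1, n - 1) + (m - 1) s(m - 1, n) for the (unsigned) Stirling numbers of the first kind; the dictionary storage and the name `stirling1_table` are illustrative choices.

```python
def stirling1_table(max_m=10):
    """Unsigned Stirling numbers of the first kind s(m, n), 1 <= n <= m <= max_m."""
    s = {(1, 1): 1, (2, 1): 1, (2, 2): 1}
    for m in range(3, max_m + 1):
        s[m, 1] = (m - 1) * s[m - 1, 1]   # a single m-cycle: (m - 1)! of them
        s[m, m] = 1                       # the identity permutation
        for n in range(2, m):
            s[m, n] = s[m - 1, n - 1] + (m - 1) * s[m - 1, n]
    return s

s = stirling1_table()
row8 = [s[8, n] for n in range(2, 7)]
row9 = [s[9, n] for n in range(2, 7)]
```

The computed rows for m = 8 and m = 9 can be checked against the table given earlier in the answers for this section.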

29(a) Hint: Exercise 15. 
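29(b) 

$$
\begin{pmatrix} 1 & 0 & 0 & 0 & 0\\ 1 & 1 & 0 & 0 & 0\\ 2 & 3 & 1 & 0 & 0\\ 6 & 11 & 6 & 1 & 0\\ 24 & 50 & 35 & 10 & 1 \end{pmatrix}
=
\begin{pmatrix} 1 & 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 & 0\\ 0 & 1 & 1 & 0 & 0\\ 0 & 2 & 3 & 1 & 0\\ 0 & 6 & 11 & 6 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 & 0 & 0\\ 1 & 1 & 0 & 0 & 0\\ 1 & 2 & 1 & 0 & 0\\ 1 & 3 & 3 & 1 & 0\\ 1 & 4 & 6 & 4 & 1 \end{pmatrix}
$$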



31(a) Hint: Exercise 13, Section 2.1. 
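31(b) 

$$
\begin{pmatrix} 1 & 0 & 0 & 0 & 0\\ 1 & 1 & 0 & 0 & 0\\ 1 & 3 & 1 & 0 & 0\\ 1 & 7 & 6 & 1 & 0\\ 1 & 15 & 25 & 10 & 1 \end{pmatrix}
=
\begin{pmatrix} 1 & 0 & 0 & 0 & 0\\ 1 & 1 & 0 & 0 & 0\\ 1 & 2 & 1 & 0 & 0\\ 1 & 3 & 3 & 1 & 0\\ 1 & 4 & 6 & 4 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 & 0\\ 0 & 1 & 1 & 0 & 0\\ 0 & 1 & 3 & 1 & 0\\ 0 & 1 & 7 & 6 & 1 \end{pmatrix}
$$

CHAPTER 3 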



3.1. Function Composition 

1(a) $f \circ g = (3,1,5,2,2)$. 1(b) $g \circ f = (4,1,4,5,2)$. 

1(c) $f \circ h = (1,1,1,1,1)$. 1(d) $h \circ f = (1,3,1,1,3)$. 

1(e) $g \circ h = (4,5,4,5,5)$. 1(g) $f \circ g \circ h = (3,5,3,5,5)$. 

1(m) $f \circ f = (1,2,1,1,5)$. 1(n) $g \circ g = (2,4,2,1,1)$. 

3(a) $fg = (12345)$. 3(b) $gf = (13542)$. 

3(e) $gh = (12)(34)(5)$. 3(m) #=(1)(235)(4). 

3(n) ^ = ^5- 3(q) r 1 = (1)(235)(4). 

3(r) $g^{-1} = (15243)$. 3(t) fV= (15432). 

5(a) $f(x) = \frac{1}{2}(3x^2 - 13x + 16)$. 5(b) $f = (3,1,2)$. 

5(c) $f = (132)$. 

7 Hint: Let f = p and g — p. Then /g S {/?} if and only if = /?. 

9 Hint: To prove the "curious fact" for row $f$, suppose $fg = fh$ and use the fact 
that $f^{-1} \in S_m$. 

11(a) 

             $e_4$        (12)(3)(4)   (1)(2)(34)   (12)(34) 
$e_4$        $e_4$        (12)(3)(4)   (1)(2)(34)   (12)(34) 
(12)(3)(4)   (12)(3)(4)   $e_4$        (12)(34)     (1)(2)(34) 
(1)(2)(34)   (1)(2)(34)   (12)(34)     $e_4$        (12)(3)(4) 
(12)(34)     (12)(34)     (1)(2)(34)   (12)(3)(4)   $e_4$ 

13(a) Because $[(12)(3)] \circ [(13)(2)] = (132) \notin G$, $G$ is not closed. 
13(b) Hint: Construct a Cayley table. 

13(c) Because $(12345)(13245) = (14)(25)(3) \notin S$, $S$ is not a subgroup. 

15 Note: If $k$ is the smallest positive integer such that $p^{k+1} \in \{p, p^2, \ldots, p^k\}$, 
then $p^k = e_m$. 



17 Hint: This is a big job, in part because the Cayley table for A\ is not 
symmetric. Work carefully. Save your work for future reference. 

19 Hint: Using associativity, compute gfh in two different ways. 



3.2. Permutation Groups 

1(a) o(p) = 12. 1(b) o(p) = 15. 
1(c) o(p) = 2. 1(d) o(p) = 3.

3(a) (1234), (13)(24), (1432), e_m, (1234), (13)(24), (1432), e_m, (1234), (13)(24).
3(b) (12345), (13524), (14253), (15432), e_m, (12345), (13524), (14253), (15432),

3(d) (12345678), (1357)(2468), (14725836), (15)(26)(37)(48), (16385274),
(1753)(2864), (18765432), e_m, (12345678), (1357)(2468).

5 Hint: G is cyclic if and only if G has a generator, i.e., a permutation p ∈ G
such that o(p) = o(G).

7(a) The generators of G are (1234) and (1432).

7(b) The generators of G are (12345), (13524), (14253), and (15432).

7(c) The generators of G are p^r, 1 ≤ r ≤ 4. (Is this G the same as the group in
part (b)?)

7(d) The generators of G are p and p^{-1} = p^5.

7(f) Hint: G has four generators. 

9(a)

          | e_4       (1234)    (1432)    (13)      (24)      (12)(34)  (13)(24)  (14)(23)
----------+--------------------------------------------------------------------------------
e_4       | e_4       (1234)    (1432)    (13)      (24)      (12)(34)  (13)(24)  (14)(23)
(1234)    | (1234)    (13)(24)  e_4       (14)(23)  (12)(34)  (13)      (1432)    (24)
(1432)    | (1432)    e_4       (13)(24)  (12)(34)  (14)(23)  (24)      (1234)    (13)
(13)      | (13)      (12)(34)  (14)(23)  e_4       (13)(24)  (1234)    (24)      (1432)
(24)      | (24)      (14)(23)  (12)(34)  (13)(24)  e_4       (1432)    (13)      (1234)
(12)(34)  | (12)(34)  (24)      (13)      (1432)    (1234)    e_4       (14)(23)  (13)(24)
(13)(24)  | (13)(24)  (1432)    (1234)    (24)      (13)      (14)(23)  e_4       (12)(34)
(14)(23)  | (14)(23)  (13)      (24)      (1234)    (1432)    (13)(24)  (12)(34)  e_4



9(b) G_3 = {e_4, (24)}. 9(c) G_4 = {e_4, (13)}.

9(d) (1432) and (14) (23). 

9(f) G has seven different cyclic subgroups. 

11 Hint: (p^n)^{-1} is the unique permutation f such that f p^n = e_m = p^n f. Show that
f = (p^{-1})^n solves these equations. Use associativity.

13 Hint: Exercise 11. 

15 It is false. One counterexample is p = (1234). 

17 (12345) = (15)(14)(13)(12) = (12) (23) (34) (45). 
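
A factorization such as the one in Exercise 17 can be checked mechanically. The following Python sketch is not part of the text; it assumes the convention that a product fg means "apply g first, then f", and the helper names `cycle`, `compose`, and `product` are illustrative choices.

```python
from functools import reduce


def cycle(*pts, n=5):
    """Return the permutation of {1, ..., n} given by a single cycle, as a dict."""
    perm = {i: i for i in range(1, n + 1)}
    for a, b in zip(pts, pts[1:] + pts[:1]):
        perm[a] = b
    return perm


def compose(f, g):
    """Return f o g: apply g first, then f."""
    return {x: f[g[x]] for x in g}


def product(*perms):
    """Compose several permutations; the rightmost factor acts first."""
    return reduce(compose, perms)


five_cycle = cycle(1, 2, 3, 4, 5)
first = product(cycle(1, 5), cycle(1, 4), cycle(1, 3), cycle(1, 2))
second = product(cycle(1, 2), cycle(2, 3), cycle(3, 4), cycle(4, 5))
print(first == five_cycle, second == five_cycle)
```

Under the same convention, `compose(cycle(1, 2, 3, 4, 5), cycle(1, 3, 2, 4, 5))` reproduces the product (14)(25)(3) quoted in the answer to Exercise 13(c) of Section 3.1.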

19 Hint: Sets A and B are equal if and only if A ⊆ B and B ⊆ A.

21 The only idempotent permutation in S_m is e_m; the cycle type of e_m is [1^m].
23(a) Hint: even + even = even.

23(b) Hint: e_m = (12)(12). 23(d) Hint: Part (c).

25 Hint: ⟨p⟩ is one of the subgroups of S_m that contains p.



3.3. Burnside's Lemma 

1(a) O_1 = {1,4}. 1(b) O_2 = {2,3}.

3(a) If f = (12)(34), g = (13)(24), and h = (14)(23), then

e_4(1) = 1, f(1) = 2, g(1) = 3, h(1) = 4;
f(2) = 1, e_4(2) = 2, h(2) = 3, g(2) = 4;
g(3) = 1, h(3) = 2, e_4(3) = 3, f(3) = 4;
h(4) = 1, g(4) = 2, f(4) = 3, e_4(4) = 4.

3(b) (1/8)[4 + 0 + 0 + 2 + 2 + 0 + 0 + 0] = 1.

3(c) G is not doubly transitive, e.g., no p ∈ G maps 1 to 2 and 2 to 4.
Alternatively, (1/8)[16 + 0 + 0 + 4 + 4 + 0 + 0 + 0] = 3 > 2.

5(a) (1/6)[5 + 0 + 2 + 3 + 2 + 0] = 2.

5(b) (1/6)[6 + 1 + 3 + 4 + 3 + 1] = 3.

5(c) (1/4)[4 + 0 + 0 + 0] = 1.

5(d) (1/4)[8 + 4 + 4 + 4] = 5.

7(b) (1/12)[16 + 8 × 1 + 3 × 0] = 2.

7(c) Hint: (1/12)[64 + 8 × 1 + 3 × 0] = 6.
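
Averages of this kind are easy to reproduce by machine. The sketch below is an illustration, not the book's method: it assumes the eight-element group {e_4, (1234), (1432), (13), (24), (12)(34), (13)(24), (14)(23)} acting on {1, 2, 3, 4}, stores each permutation as a tuple of images, and averages a power of the number of fixed points. The names `D4`, `fixed_points`, and `average_fixed_points` are hypothetical.

```python
from fractions import Fraction

# Each permutation of {1, 2, 3, 4} is a tuple of images: perm[i - 1] is the image of i.
D4 = [
    (1, 2, 3, 4),  # e_4
    (2, 3, 4, 1),  # (1234)
    (4, 1, 2, 3),  # (1432)
    (3, 2, 1, 4),  # (13)
    (1, 4, 3, 2),  # (24)
    (2, 1, 4, 3),  # (12)(34)
    (3, 4, 1, 2),  # (13)(24)
    (4, 3, 2, 1),  # (14)(23)
]


def fixed_points(perm):
    """Count the points i with perm(i) = i."""
    return sum(1 for i, image in enumerate(perm, start=1) if i == image)


def average_fixed_points(group, power=1):
    """Average F(p)**power over the group, as an exact fraction."""
    total = sum(fixed_points(p) ** power for p in group)
    return Fraction(total, len(group))


print(average_fixed_points(D4))      # average of F(p)
print(average_fixed_points(D4, 2))   # average of F(p)^2
```

With `power=1` the average counts orbits on points; with `power=2` it counts orbits on ordered pairs, which is the quantity compared with 2 when testing double transitivity.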

9(a) Hint: Show that o(O_x) = o(O_y).

9(b) Hint: If p(1) = x and q(1) = y, then qp^{-1}(x) = y.

11(a) Hint: Example 3.3.17. 

11(b) Hint: Exercise 9, Section 2.4. 

13 It's off by about 0.03. 

15 Hint: Mimic the proof of Theorem 3.3.18. 

17 Hint: F(e_m) = m > 1.

19 Hint: Exercise 16. 
3.4. Symmetry Groups




3(b) The plane symmetries are e_3, (123), and (132).

5(a) ⟨(12345)⟩ = {e_5, (12345), (13524), (14253), (15432)}.

5(b) ⟨(12345)⟩ ∪ {(12)(35), (13)(45), (14)(23), (15)(24), (25)(34)}.

7 Hint: As in Example 3.4.6, show that half the symmetries are rotations and
half are reflections.



9(b) It is the group A 4 from Exercise 7, Section 3.3. 
11

g               g                   g               g
(16)(25)(34)    (18)(27)(36)(45)    (16)(2453)      (1647)(2835)
(25)            (13)(24)(57)(68)    (15)(26)        (17)(28)
(34)            (12)(34)(56)(78)    (14)(36)        (16)(38)
(16)(2354)      (1746)(2538)        (145632)        (124875)(36)
(12)(56)        (35)(46)            (124653)        (126873)(45)
(13)(46)        (25)(47)            (153624)        (18)(243756)
(132645)        (156843)(27)        (24)(35)        (14)(58)
(154623)        (134865)(27)        (1265)(34)      (1674)(2583)
(142635)        (18)(265734)        (1364)(25)      (1764)(2358)
(135642)        (137862)(45)        (23)(45)        (23)(67)
(123654)        (157842)(36)        (1562)(34)      (1476)(2385)
(16)            (15)(26)(37)(48)    (1463)(25)      (1467)(2853)

13 Hint: It shouldn't be necessary to start over from scratch. 

17(a) Hint: Each face is incident with five vertices, but 12 x 5 is not the number 
of vertices; it is too large by a factor of 3. (Why?) 

19 Hint: What angle do two adjacent sides of the hexagon make? 

21(a) 6 + 8=12 + 2. 21(b) 4 + 4 = 6 + 2. 



3.5. Color Patterns 

1(a) g=(y,b,w, r). 1(b) g=(b,r,y,w). 

1(c) P = {(r, r, w, b), (w, r, b, r), (b, w, r, r), (r, b, r, w)}.
1(d) Hint: o(P) = 8. 1(e) 70.
1(f) 55. (Don't forget the 1-cycles.) 



3(b) [Figure not recoverable from the source.]

and the four colorings obtained by interchanging the x's and y's.
3(d) 208. 

3(e) Hint: The number of patterns is an integer. 
3(f) Hint: Part (e). 

3(g) Hint: If q ∈ S_m is a p-cycle, then q^i is a p-cycle, 1 ≤ i < p.
5 Hint: Exercise 5(b), Section 3.4. 



7(a) [Figure not recoverable from the source; the surviving labels are r, b, r, w, w, b.]



9(a) Hint: Exercise 10, Section 3.4. 

9(b) C(4 + 3- 1,4) = 15. 

11 Hint: Exercise 10. 

13(a) Hint: Figure 3.4.7. Answer: 23. 

13(b) 333. 13(c) 4,173,775. 

15 (1/8)(n^8 + n^4 + 2n^2 + 4n).

17 4,783,131. 

21 Hint: p = q if and only if f = f(q^{-1}p), for all f ∈ C_{m,n}, and p = q, if and
only if q^{-1}p = e_m.



3.6. Polya's Theorem 

1(a) Hint: Eliminate all colorings with a white vertex from Fig. 3.6.2. 
1(b) W G (r, b) =\[M\ + M\ + 2M 4 ] . 
3(b) Hint: Using part (a), show that W_G(r, w, b) = (r^3 + w^3 + b^3) +
(r^2 w + r^2 b + rw^2 + rb^2 + w^2 b + wb^2) + 2rwb.

3(c) Hint: Recall that a system of distinct representatives consists of one 
coloring from each of the 11 color patterns. In particular, more than one 
correct answer is possible. 

5(b) Hint: Compare with Exercise 3(c), Section 3.5. 



[Figure not recoverable from the source: colorings with vertices labeled b and w.]

7(a) W_G(r, w, b) = M_[6] + M_[5,1] + 3M_[4,2] + 3M_[3^2] + 3M_[4,1^2] + 6M_[3,2,1] + 11M_[2^3].

7(b) [Figure not recoverable from the source.]

9 W_G(r, w, b) = M_[8] + M_[7,1] + 3M_[6,2] + 3M_[5,3] + 7M_[4^2] + 3M_[6,1^2] + 7M_[5,2,1] +
13M_[4,3,1] + 22M_[4,2^2] + 24M_[3^2,2].

11 There are five patterns of weight r^2 w^2 b^2.

13 W_G(r, w, b, y) = M_[4] + M_[3,1] + M_[2^2] + M_[2,1^2] + 2M_[1^4].

15(a) 1. 15(b) 1. 15(c) 2. 

17 (1/3) × 15!/(5! 5! 5!) = 252,252.

19(a) 3. 19(b) 2. 19(c) Hexagon. 

19(d) No, it is much easier simply to exhibit all possible inequivalent "color" 
patterns. 



3.7. The Cycle Index Polynomial 

1 Z_3 = (1/6)(s_1^3 + 3s_1 s_2 + 2s_3).

3 (1/6)(n^3 + 3n^2 + 2n) = (n + 2)(n + 1)n/6 = C(3 + n - 1, 3).
5(a) Hint: Figure 3.4.5. 5(b) Hint: Figure 3.4.7.

11 Hint: Exercise 8.

13 Hint: Use "0" to represent "10" in the disjoint cycle factorization of 
p G sf^ C Siq. Use Exercise 8 and mimic Example 3.7.15. 

19 Hint: Exercise 18 in this section and Exercise 17(b) in Section 2.5. (Compare 
with Equation (2.6) in Section 2.2.) 
23(b) Hint: For matrix L_3, the diagonal product ∏_p corresponding to permutation
p ∈ S_3 is given in the following table. Show that per(L_3) = Σ_p ∏_p =
6Z_3(M_1, M_2, M_3). Use Theorem 3.7.8(a).

p      e_3     (12)     (13)   (23)      (123)   (132)
∏_p    M_1^3   M_1M_2   0      2M_1M_2   2M_3    0



25(b) H_{5-3}(x,y,z) = H_2(x,y,z) = M_[2](x,y,z) + M_[1^2](x,y,z), so H_{5-3}(1,2,3) =
[1^2 + 2^2 + 3^2] + [1 × 2 + 1 × 3 + 2 × 3] = 14 + 11 = 25 = S(5,3).
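
The arithmetic in 25(b) is one instance of the identity H_{n-r}(1, 2, ..., r) = S(n, r). As a hedged illustration (the function names are assumptions, not the book's notation), the following Python sketch evaluates the homogeneous symmetric function by brute force and compares it with Stirling numbers of the second kind computed from the recurrence S(n + 1, r) = S(n, r - 1) + rS(n, r).

```python
from itertools import combinations_with_replacement
from math import prod


def homogeneous(k, values):
    """Sum of all degree-k monomials in the given values (H_k evaluated numerically)."""
    return sum(prod(c) for c in combinations_with_replacement(values, k))


def stirling2(n, r):
    """Stirling number of the second kind via S(n, r) = S(n-1, r-1) + r*S(n-1, r)."""
    if n == r:
        return 1
    if r == 0 or r > n:
        return 0
    return stirling2(n - 1, r - 1) + r * stirling2(n - 1, r)


print(homogeneous(2, [1, 2, 3]), stirling2(5, 3))
```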



CHAPTER 4 

4.1. Difference Sequences 

1(a) a mi = 1492. 1(b) a 497 = 1066. 1(c) a 491 = 2004. 

3(c) 3 4 9 18 31 48 69 94 123 ••• 
1 5 9 13 17 21 25 29 ••• 
4 4 4 4 4 4 4 ••• 
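
A difference array like the one in 3(c) can be generated and inverted mechanically. The sketch below is illustrative only: `difference_table` and `newton_term` are hypothetical names, and the reconstruction uses the leading entries of the rows as coefficients of C(n, k), which is the standard Newton form.

```python
from math import comb


def difference_table(seq):
    """Return the rows seq, Δseq, Δ²seq, ... stopping at the first all-zero row."""
    rows = [list(seq)]
    while len(rows[-1]) > 1 and any(rows[-1]):
        prev = rows[-1]
        rows.append([b - a for a, b in zip(prev, prev[1:])])
    return rows


def newton_term(leading, n):
    """Rebuild a_n from the leading entries of the difference rows."""
    return sum(d * comb(n, k) for k, d in enumerate(leading))


a = [3, 4, 9, 18, 31, 48, 69, 94, 123]
rows = difference_table(a)
leading = [row[0] for row in rows]          # leading edge of the array
print(leading)
print([newton_term(leading, n) for n in range(9)])
```

Here the leading edge is 3, 1, 4 followed by zeros, so a_n = 3C(n,0) + C(n,1) + 4C(n,2) = 2n^2 - n + 3, and summing the first nine terms gives 3C(9,1) + C(9,2) + 4C(9,3).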

3(d) Hint: Equation (4.10). 
3(e) b_n = (1/6)(4n^3 - 9n^2 + 23n + 6).

5(a) 1 2 4 8 16 32 ••• 
1 2 4 8 16 32 ••• 
1 2 4 8 16 32 ••• 

7 Hint: Mimic Gauss's approach to summing the first n positive integers. 
9 Hint: Chu's theorem. 
11(c) Hint: Exercise 3(a). 

11(d) C(9, 1) x 3 + C(9, 2) x 1 + C(9, 3) x 4 = 399. 
11(e) Hint: Given that 1131 = 2n^2 - n + 3, what is n?
11(f) The sum is 652,050.

13(a) C(k+1, 1) × 0 + C(k+1, 2) × 1 + C(k+1, 3) × 2 =
k(k+1)(2k+1)/6.

17 Hint: By induction, it suffices to show that x^m is a linear combination of
x^(r)/r!, 0 ≤ r ≤ m.

19(c) Hint: p_m(n) - p_{m-1}(n - 1) = p_m(n - m), 1 < m < n.
21(c) f{x) = I [x 3 + 6x 2 + 5x + 6] .
21(d) /(*) = i [x 3 + 6x 2 + 5x + 6] . 

23(a) Hint: One possibility is induction on n; another involves proving that
Δ^4 S(n + 2, n) = 3, n ≥ 0; a third approach counts the partitions of
{1, 2, ..., n + 2} into n subsets.

23(b) Hint: One approach is to use induction on n; another uses part (a). 

25 f(n) = C(n,0) + 5C(n,1) + 6C(n,2).

27(a) S(n+1, n) = 0 × C(n,0) + 1 × C(n,1) + 1 × C(n,2) = C(n+1, 2).

27(b) Hint: Recall that S(n + 1 , n) is the number of ways to partition an 
(n + l)-element set into the disjoint union of n nonempty subsets. 

29(a) Hint: Exercise 7. 

29(b) Hint: Show that n = (r - s)(r + s). 

31 Hint: Exercise 29(a). 

33 Hint: Show that any such n is a difference of squares; use Exercise 32. 
4.2. Ordinary Generating Functions 

1 Hint: C(m, n) = 0, n > m. (A closed formula for g(x) = Σ_{n≥0} C(n, r)x^n,
where r is a fixed but arbitrary nonnegative integer, can be found in
Theorem 4.2.11.)

3(a) g(x) = (1 - x)/(1 - 3x - 2x^2).

3(b) (2 - 3x)/(1 - 2x + 3x^2).

5 Hint: Factor 1 - 3x - 10x^2 + 24x^3.

7 Hint: This is the Maclaurin series expansion from calculus. 

11(a) g(x) = x^4 + 4x^5 + 10x^6 + 16x^7 + 19x^8 + 16x^9 + 10x^10 + 4x^11 + x^12.

11(b) Hint: From part (a), the coefficient of x^7 in g(x) is a_7 = 16. Show that 12
compositions of 7 having 4 parts, none of which is larger than 3, can be
obtained by rearranging the parts of the partition [3, 2, 1^2], and that the
remaining 4 come from rearranging the parts of [2^3, 1].

11(d) Using your answer to part (a), show that a_7 = 16 = a_9.
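
The expansion in 11(a) and the count in 11(b) can be confirmed by direct computation. This Python sketch assumes, consistently with those answers, that g(x) = (x + x^2 + x^3)^4; the helpers `poly_mul`, `poly_pow`, and `compositions` are illustrative names.

```python
from itertools import product


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (index = exponent)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_pow(p, k):
    """Raise a polynomial to the k-th power by repeated multiplication."""
    result = [1]
    for _ in range(k):
        result = poly_mul(result, p)
    return result


def compositions(total, parts, largest):
    """All ordered sums of `parts` integers from 1..largest adding up to `total`."""
    return [c for c in product(range(1, largest + 1), repeat=parts) if sum(c) == total]


g = poly_pow([0, 1, 1, 1], 4)        # coefficients of (x + x^2 + x^3)^4
print(g)
print(len(compositions(7, 4, 3)))
```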

13 Hint: This gives an independent proof that Example 4.2.10 ends with a 
correct solution. 

15 b_0 = a_0 and b_{n+1} = Δa_n, n ≥ 0.

19(a) Hint: Corollary 2.2.3.
19(d) Hint: Section 1.5, Exercise 11.

25(a) g(x) = 1/(1 - 3x).

25(b) Hint: n^3 = C(n, 1) + 6C(n, 2) + 6C(n, 3).

27(a) a_n = n - 1 + 1/(n + 1) = n^2/(n + 1).

27(b) Hint: D_x(x f(x)) = Σ_{n≥0} n^2 x^n.



4.3. Applications of Generating Functions 

1(a) C(-3,4) = 15. 1(b) C(-4,3) = -20. 

1(c) C(f,2) = -i. 1(d) C(-f,2)=§. 
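
The values in 1(a) and 1(b) follow from the usual definition of the extended binomial coefficient, C(u, n) = u(u - 1)···(u - n + 1)/n!. A minimal sketch, with `ext_binom` as an assumed name and exact rational arithmetic so that fractional u can be tried as well:

```python
from fractions import Fraction
from math import factorial


def ext_binom(u, n):
    """Extended binomial coefficient C(u, n) for rational u and integer n >= 0."""
    u = Fraction(u)
    numerator = Fraction(1)
    for i in range(n):
        numerator *= u - i          # falling factorial u(u-1)...(u-n+1)
    return numerator / factorial(n)


print(ext_binom(-3, 4), ext_binom(-4, 3))
```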

3 Hint: Example 4.3.2. 

5 Hint: If all else fails, try induction. 

9 Hint: Exercise 6, Section 4.2, and the ratio test. 

13 Hint: Exercise 10. 

15 Hint: Exercises 13-14. 

17(a) g(x) = (1 + x + x^2)(1 + x^2 + x^4)(1 + x^3 + x^6)···(1 + x^r + x^{2r})···

19(c) Suppose π is a distinct m-part partition of n. What's left when the first
column is removed from its Ferrers diagram F(π)?

19(e) Hint: Suppose π = [π_1, π_2, ..., π_m] ⊢ n satisfies π_1 > π_2 > ··· > π_m > 0.
Define φ(π) = [μ_1, μ_2, ..., μ_m] by μ_i = π_i - (m - i), 1 ≤ i ≤ m. Show that
φ is a one-to-one function from the partitions of n having distinct parts and
length m, onto the m-part partitions of n - C(m, 2).

21 Hint: x^m = Σ_{r≥1} S(m, r)x^(r).

23 Hint: In Theorem 4.3.5, f_r(x) = x f_{r-1}(x)/(1 - rx).

27(a) (1 + x + x^2 + ··· + x^10)^4 = [(1 - x^11)/(1 - x)]^4.

27(b) (x + x^3 + x^5 + ···)^4 = [x/(1 - x^2)]^4.

27(c) (x^2 + x^3 + x^4 + x^5)(x^7 + x^8 + x^9)(x^4 + x^5 + x^6 + ···)(1 + x + x^2 + ··· +
x^6) = x^13(1 - x^4)(1 - x^3)(1 - x^7)/(1 - x)^4.

27(d) 1/(1 - x)^4. (See Equation (4.29).)

29(a) [x/(l-x)f. 29(b) [x 3 /(l-x)] 8 . 

31(a) g_k(x) = [x + x^2 + ··· + x^6]^k = [(x - x^7)/(1 - x)]^k.

31(d) a_4(20) = 35.
35(a) Hint: Show that the coefficient of x^n in the product

(1 + a_1 x + a_1^2 x^2 + ···)(1 + a_2 x + a_2^2 x^2 + ···) ··· (1 + a_m x + a_m^2 x^2 + ···)

is a sum of terms a_1^{n_1} a_2^{n_2} ··· a_m^{n_m}, one for each of the C(n + m - 1, n)
nonnegative integer solutions to n_1 + n_2 + ··· + n_m = n.

35(e) Hint: H_3(a, b, c) = [(a^3 + b^3 + c^3) + (a^2 b + a^2 c + ab^2 + ac^2 + b^2 c +
bc^2) + abc], H_2(a, b, c) = [(a^2 + b^2 + c^2) + (ab + ac + bc)], H_1(a, b, c) =
E_1(a, b, c) = (a + b + c), E_2(a, b, c) = (ab + ac + bc), and E_3(a, b, c) =
abc.

35(f) Hint: Part (d).

37 Hint: Make appropriate choices for a_1, a_2, ..., a_m in Exercise 35(d).

39(f) Hint: Show that the right-hand side obeys the same boundary conditions 
and recursion as the left-hand side, i.e., confirm the analogs of parts (c) and 
(e) for the right-hand side. 

4.4. Exponential Generating Functions 

1(a) a_n = n2^{n-1}, n ≥ 0. 1(b) a_n = 1 + 3^n, n ≥ 0.

1(c) Hint: Equation (4.52). 

3(f) Hint: It is a consequence of the mean value theorem that if f'(x) = g'(x) for
all x in some open interval I, then there exists a constant C such that
g(x) = f(x) + C, x ∈ I.

3(g) Hint: Part (f).

5(e) Hint: S(n + 1, r) = S(n, r - 1) + rS(n, r).
5(g) Hint: B_0 = 1 = exp(0).
7 Hint: Equation (4.59). 

9(a) Hint: Apply the fundamental theorem of calculus to the k = 2 case of 
Equation (4.58). 

11(a) Hint: Equation (4.30a). 

11(c) Hint: Theorem 4.4.5 and Exercise 15, Section 4.2. 

13 Hint: Use Exercise 12, Section 2.3, and obtain a new proof of Theorem 4.4.5. 

17 f(x,y) = 1/(1 -x-xy) 

19 
21(a) Hint: One approach is this: Let S = {1, 2, ..., n}. Suppose
1 = d_1 < d_2 < ··· < d_r = n are the distinct positive divisors of n. Let
S_i = {k ∈ S : GCD(k, n) = d_i}, 1 ≤ i ≤ r. Show that o(S_i) = φ(n/d_i).

21(b) φ(1) + φ(2) + φ(3) + φ(6) = 1 + 1 + 2 + 2.

21(d) Hint: Part (a) and Corollary 4.4.19.

21(e) 6μ(1) + 3μ(2) + 2μ(3) + μ(6) = 2.
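
Parts (b) and (e) are the n = 6 cases of two general identities: the sum of φ(d) over the divisors d of n equals n, and the sum of μ(d)·(n/d) over the same divisors equals φ(n). The Python sketch below checks both by brute force; it is an illustration with assumed helper names, not the text's derivation.

```python
from math import gcd


def divisors(n):
    """Positive divisors of n in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def phi(n):
    """Euler's totient, counted directly from the definition."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def mobius(n):
    """Möbius function: 0 if a square divides n, else (-1)^(number of prime factors)."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # p^2 divided the original n
                return 0
            result = -result
        p += 1
    if n > 1:                   # one remaining prime factor
        result = -result
    return result


print(sum(phi(d) for d in divisors(6)))
print(sum(mobius(d) * (6 // d) for d in divisors(6)))
```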

23 Hint: Equations (4.62) and (4.66). 

25 Hint: Corollary 4.4.19. 

27(a) Hint: Section 4.3, Exercises 33 and 34. 

27(b) Hint: Section 4.3, Exercises 34 and 35. 

29 Hint: Show that g(x)(e^x - 1) = x.

31(b) Hint: Exercise 13, Section 2.1, and S(n + 1, r + 1) - S(n, r) =
(r + 1)S(n, r + 1).

31(c) Hint: From part (b), 2s_n = s_n + Σ_{r≥1} C(n, r)s_{n-r} = Σ_{r≥0} C(n, r)s_{n-r},
n ≥ 1, so 2g(x) = 1 + e^x g(x).

31(e) Hint: 1 = Σ_{k≥1} 1/2^k.

33 Hint: Exercise 32. 

4.5. Recursive Techniques 
1(b) ^-^.iL^I =5, n> 1. 
1(c) Hint: Exercise 16(b), Section 1.6. 
1(e) lim L_{n+1}/L_n = φ.

3(a) a_n = 3^n - 2^n, n ≥ 0. 3(b) a_n = 3^n + 2^n, n ≥ 0.

3(c) a_n = 5(3^n) - 3(2^n), n ≥ 0.
5(a) a_n = 5n + 2 + 2^n, n ≥ 0.
5(b) a_n = (n^2 + n + 1)2^n, n ≥ 0.
5(c) a_n = (n^2 + 2n + 1)2^n, n ≥ 0.

7(a) a„ = ± (3n 2 - « + 6), n > 0. 7(b) a„ = (n - l) 2 , « > 0. 
7(c) On = \ (2m 3 + 3« 2 + In + 12), « > 0. 

9(a) a_n = (n + 2)3^n, n ≥ 0. 9(b) a_n = (n + 2)3^n, n ≥ 0.

11(a) a_n = 4(3^n) + 3(-2)^n - 2^n, n ≥ 0.
11(b) a_n = (-3)^n + (2n + 3)2^n, n ≥ 0.
11(c) a_n = (-3)^n + (2n + 3)2^n, n ≥ 0.

13 L_n = φ^{n+1} + (-1/φ)^{n+1}, n ≥ 0, where φ = (1 + √5)/2. (Now can you prove
your conjecture in Exercise 1(b)?)
15 Hint: Suppose a_n = c_1 a_{n-1} + c_2 a_{n-2} + ··· + c_k a_{n-k} + w(n), n ≥ k, and
b_n = c_1 b_{n-1} + c_2 b_{n-2} + ··· + c_k b_{n-k} + w(n), n ≥ k. Define {d_n} (not to be
confused with a difference sequence) by d_n = b_n - a_n. Show that {d_n} satisfies
the homogeneous recurrence d_n = c_1 d_{n-1} + c_2 d_{n-2} + ··· + c_k d_{n-k}, n ≥ k.

17(c) r_n = (1/2)(n^2 + n + 2).
CHAPTER 5 

5.1. The Pigeonhole Principle 

1 Hint: Let S = {s_1, s_2, ..., s_n}. Consider the remainders left when s_1 +
s_2 + ··· + s_t is divided by n, 1 ≤ t ≤ n.

3 Hint: The midpoint of the segment joining P_i = (x_i, y_i) and P_j = (x_j, y_j) is
M = ((1/2)[x_i + x_j], (1/2)[y_i + y_j]).

5 Hint: Consider the average weight of the n objects.

7 Hint: Factor each element of S as the product of a power of 2 and an odd
integer.

9(b) The other three isomorphisms are (c,b,a,d), (b,c,d,a), and (c,b,d,a).

11 Hint: List the four nonisomorphic graphs on three vertices.
13 Hint: The first theorem of graph theory. 




23(a) 5 x 3 is odd. 

23(b) The largest vertex degree cannot be as large as the number of vertices of 
positive degree. 




29(a) Hint: "A picture is worth a thousand words." 
5.2. Edge Colorings and Ramsey Theory

3(a) Hint: Using N(2,4) = 4 and N(3,3) = 6, mimic the proof that
N(3,3) ≤ 6.

3(b) Hint: Show directly, without using Theorem 5.2.3, that N(4,4) ≤
N(3,4) + N(4,3).

5 Hint: Corollary 5.2.10 or Fig. 5.2.6.

9(b) f_3(x) = (1/6)[(1 + x)^3 + 3(1 + x)(1 + x^2) + 2(1 + x^3)] = 1 + x + x^2 + x^3.
11 Either the edges are adjacent or they are not.
13(a) Hint: Draw some pictures.

13(b) In view of Example 3.7.17, the coefficient of x^3 in f_6(x) is (1/720)[455 +
15(35 + 28) + 40(1 + 4) + 60(1 + 18) + 180(1) + 144(0) + 120(1 + 2) +
40(5) + 120(1)] = 5.

15 Hint: Examples 3.7.17 and 5.2.9. 

17(a) 



17(b) The complements of the graphs in part (a). 
19 Hint: Consider their complements. 
21(a) Hint: Six colorful pictures should suffice. 
21(c) Hint: Figure 5.2.2. 

5.3. Chromatic Polynomials 

1(a) p(G,x) = x(x - 1)(x - 2)(x^2 - 3x + 3).

1(b) Hint: Show that this graph is isomorphic to the graph in part (a).

1(c) Hint: Theorem 5.3.11. 1(d) Hint: Equation (5.13).

1(e) p(G,x) = x(x - 1)(x - 2)^3.

3(b) p(G,x) = x(x - 1)(x - 2)(x - 3)(x^2 - 4x + 5).

5(a) Hint: This is a statement about binomial coefficients.

5(b) Hint: This is a statement about Stirling numbers of the first kind.

7(a) p(G,x) = x^(5) + x^(4) = x(x - 1)(x - 2)(x - 3)^2.

7(b) p(G,x) = x^(6) + 3x^(5) + 3x^(4) + x^(3) = x(x - 1)(x - 2)(x^3 - 9x^2 + 29x - 32).





9(a) [Figure not recoverable from the source.]

9(b) p(C_4, x) = x(x - 1)(x^2 - 3x + 3); p(C_6, x) = x(x - 1)(x^4 - 5x^3 + 10x^2 -
10x + 5).

11(b) Hint: Compute f(1).

13 Hint: Chromatic reduction. 

15 Hint: Theorem 5.3.23. 

17(a) Hints: K_{s,t} = K_s^c ∨ K_t^c and x^m = Σ_{r=1}^{m} S(m, r)x^(r).

17(b) p(K_{2,3}, x) = (x^(1) + x^(2)) ∨ (x^(1) + 3x^(2) + x^(3)) = x^(2) + 4x^(3) + 4x^(4) + x^(5).
17(c) p(K_{3,3}, x) = x^(6) + 6x^(5) + 11x^(4) + 6x^(3) + x^(2).

19(a) Hint: From the answer to Exercise 9(b), p(C_4, x) = x(x - 1) ×
(x^2 - 3x + 3).

19(b) Hint: Let f(x) = p(K_{4,3}, x) = x(x - 1)(x^5 - 11x^4 + 55x^3 - 147x^2 +
204x - 115). Show that f(1.7) = -0.58 and f(1.8) = +0.15.
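
The two numerical values in 19(b), and the polynomial itself, can be checked with a few lines of Python. This sketch is an illustration under the assumption that the polynomial is the chromatic polynomial of the complete bipartite graph with parts of sizes 4 and 3; `count_colorings` is a hypothetical brute-force counter that colors one side and then lets every vertex on the other side avoid the colors already used.

```python
from itertools import product


def f(x):
    """The polynomial quoted in the hint for Exercise 19(b)."""
    return x * (x - 1) * (x**5 - 11 * x**4 + 55 * x**3 - 147 * x**2 + 204 * x - 115)


def count_colorings(s, t, colors):
    """Proper colorings of the complete bipartite graph with parts of sizes s and t."""
    total = 0
    for left in product(range(colors), repeat=s):
        unused = colors - len(set(left))   # colors still allowed on the other side
        total += unused ** t
    return total


print(round(f(1.7), 2), round(f(1.8), 2))
print([count_colorings(4, 3, x) for x in range(5)])
```

The sign change between 1.7 and 1.8 is what locates a real root of f in that interval.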

19(c) Hint: Use the fact that the coefficients of p(G,x) alternate in sign. 

21 Hint: Factor /(x). 

23(a) Hint: How many of the 11 nonisomorphic graphs on 4 vertices are trees? 
23(c) [Figure not recoverable from the source.]

25 Hint: Explain why this is a restatement of Exercise 7(b).
27(a) Hint: One of them can be found in Exercise 1.

29 Hint: Show by induction on the number of edges that p(G, t) is nonzero with
sign (-1)^{n-c}, t ∈ (0, 1), where c is the number of components.

31(a) Hint: Revisit the proof of Theorem 5.2.5. 

31(b) Hint: Induction on s + t. 

5.4. Planar Graphs 

1 Hint: Lemma 5.3.17. 
3(c) 




5(a) 




7 Hint: K_{3,3} is bipartite.

9 Hint: Show that G has a nonplanar subgraph.
13(b) The dual of a cube is a regular octahedron.
17 [Figure not recoverable from the source.]



19 Hint: If H A is a graph, then G = H d . Otherwise, let G be a graph obtained 
from H A by subdividing some of its edges. Explain why Gd = H. 



5.5. Matching Polynomials 

3(a) M(K 6 ,x) =x 6 - 15x 4 + 45x 2 - 15. 
3(c) M(P_7, x) = x^7 - 6x^5 + 10x^3 - 4x.
3(e) M(C_7, x) = x^7 - 7x^5 + 14x^3 - 7x.
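
These three polynomials can be recovered by counting matchings directly. The sketch below assumes the usual definition M(G, x) = Σ_r (-1)^r m_r x^{n-2r}, where m_r is the number of r-matchings of a graph on n vertices; the function names and the dictionary representation {exponent: coefficient} are illustrative choices.

```python
from itertools import combinations


def matching_counts(n, edges):
    """Return [m_0, m_1, ...], where m_r counts sets of r pairwise disjoint edges."""
    counts = []
    for r in range(n // 2 + 1):
        m = 0
        for subset in combinations(edges, r):
            endpoints = [v for e in subset for v in e]
            if len(set(endpoints)) == 2 * r:   # no vertex is used twice
                m += 1
        counts.append(m)
    return counts


def matching_polynomial(n, edges):
    """Matching polynomial as a dict mapping exponent to coefficient."""
    return {n - 2 * r: (-1) ** r * m
            for r, m in enumerate(matching_counts(n, edges)) if m}


K6 = list(combinations(range(6), 2))           # complete graph on 6 vertices
P7 = [(i, i + 1) for i in range(6)]            # path on 7 vertices
C7 = P7 + [(6, 0)]                             # cycle on 7 vertices
for name, n, edges in [("K6", 6, K6), ("P7", 7, P7), ("C7", 7, C7)]:
    print(name, matching_polynomial(n, edges))
```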

5 Hint: If G_1 and G_2 are isomorphic graphs, prove that there is a one-to-one
correspondence between the r-matchings of G_1 and the r-matchings of G_2.

7(b) If e is the "middle" edge of P4, then the 1-matching M = {e} is a maximal 
matching but not a maximum matching. 

9(a) K_3. 9(b) K_{1,2}.

9(e) Hint: The clique number, ω(G) = α(G^c).
9(g) Kj-e. 

13(a) Both have degree sequence (4, 3^2, 2^4, 1^2).

13(b) Both have chromatic polynomial x(x - 1)^6 (x - 2)^2.

13(c) Both have matching polynomial x^9 - 10x^7 + 29x^5 - 25x^3 + 5x.

15 Hint: The sum of the characteristic roots of the n × n matrix A = (a_ij) is the
trace of A, defined by tr(A) = Σ a_ii.

17 Hint: Use the quadratic formula to find squares of roots. Then use a 
calculator. Two-decimal-place accuracy should suffice. 

19(c) Hint: Exercise 14, Section 5.3. 

23 Hint: Like the determinant, the permanent can be expanded by rows or
columns. For example, if A_ij is the matrix obtained from A by deleting row i
and column j, then

per(A) = Σ_{j=1}^{n} a_ij per(A_ij), 1 ≤ i ≤ n.

27 Let G_1 = (V,E) and G_2 = (V,F), where V = {1, 2, ..., n}. Suppose f ∈ S_n is
fixed but arbitrary. Let A(G_1) = (a_ij), A(G_2) = (b_ij), and P = P(f) = (δ_{i,f(j)}).
We will prove the equivalent formulation that f is an isomorphism from G_1
onto G_2 if and only if P^{-1}A(G_2)P = A(G_1). Because the (i,j)-entry of P^{-1} is
the (j,i)-entry of P, the (i,j)-entry of P^{-1}A(G_2)P is

Σ_{s,t=1}^{n} δ_{s,f(i)} b_st δ_{t,f(j)} = b_{f(i)f(j)}.

Now, b_{f(i)f(j)} = a_ij, 1 ≤ i,j ≤ n, if and only if F = {{f(i), f(j)} : {i,j} ∈ E},
if and only if f : V → V is an isomorphism from G_1 onto G_2.
5.6. Oriented Graphs

1(a) All 2^3 = 8 orientations of the tree K_{1,3} are acyclic.
1(c) Hint: Evaluate (-1)^4 x(x - 1)(x - 2)^2 at x = -1.

5(a) L(G)
5(b) Hint: t(G) = 11.

7(a) s(K_{1,3}) = (4,1,1,0). 7(b) = (4,4,2,0).

9 Hint: Exercise 8. 

11(a) s(C_6) = (4,3,3,1,1,0) majorizes d(C_6) = (2,2,2,2,2,2) because 4 ≥ 2;
4 + 3 ≥ 2 + 2; 4 + 3 + 3 ≥ 2 + 2 + 2; ... ; and 4 + 3 + 3 + 1 + 1 + 0 =
2 + 2 + 2 + 2 + 2 + 2.

11(b) Hint: s(G) = (5,3,3,2, 1,0). 
11(c) Hint: s(G) = (5, 5, 3, 3, 2, 0). 
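
The chain of inequalities in 11(a) is the definition of majorization applied term by term. A minimal Python sketch of that test follows; the name `majorizes` is an assumption, and the exact equality of totals is appropriate here only because the spectra in this exercise are integers.

```python
from itertools import accumulate


def majorizes(s, d):
    """True if s majorizes d: equal totals and every partial sum of s dominates."""
    s = sorted(s, reverse=True)
    d = sorted(d, reverse=True)
    if len(s) != len(d) or sum(s) != sum(d):
        return False
    return all(a >= b for a, b in zip(accumulate(s), accumulate(d)))


print(majorizes([4, 3, 3, 1, 1, 0], [2, 2, 2, 2, 2, 2]))
```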

13 Hint: Show that L(G) + L(G^c) = nI_n - J_n, where J_n is the n × n matrix each
of whose entries is 1; use the fact from linear algebra that commuting
symmetric matrices are simultaneously diagonalizable.

15 Hint: Exercise 13.

17 Hint: G_1 ∨ G_2 = (G_1^c + G_2^c)^c. Use Exercise 13.

19(a) Hint: Because s(K_{2,2}) = (4,2,2,0), it suffices to show (independently)
that t(K_{2,2}) = [4 × 2 × 2]/4 = 4.

19(b) Hint: s(K_{2,3}) = (5,3,2,2,0).

19(c) Hint: s(K_{1,4}) = (5,1,1,1,0).

21(a) s(P_4) = (2 + √2, 2, 2 - √2, 0).

23(a) s(G) = (4,3,3,1,1,0). 23(b) s(G^c) = (5,5,3,3,2,0).

23(c) s(H) = (5,3,3,2,1,0). 23(d) s(H^c) = (5,4,3,3,1,0).

23(g) s(G + G^c) = (5,5,4,3,3,3,3,2,1,1,0,0).
5.7. Graphic Partitions

1 Only (3,1) fails to weakly majorize (2.5,1.5,1).

3(a) The partitions [6^2, 2^2], [5^4], [3, 2^2, 1^2], and [2, 1^5] are not graphic.

5(a) C_6 and the union, C_3 + C_3.




7 Hint: Each graph will have three edges, but the numbers of vertices may 
differ. 

9 Hint: Mimic the argument that led to Inequalities (5.40). 

11 Threshold Algorithm. Suppose π = [π_1, π_2, ..., π_n] ⊢ 2m is a threshold
partition.

1. Let V = {1, 2, ..., n} and E = ∅.

2. For i = 1 to f(π).

3. For j = i to π_i.

4. E = E ∪ {{i, j + 1}}.

5. Next j.

6. Next i.

7. Return G = (V, E).
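
The steps above can be sketched in Python as follows. This is a hedged transcription, not the book's code: it assumes that f(π) denotes the trace of π (the number of indices i with π_i ≥ i), that the input really is a threshold partition, and the names `trace`, `threshold_graph`, and `degree_sequence` are illustrative.

```python
def trace(pi):
    """Number of indices i (1-based) with pi_i >= i."""
    return sum(1 for i, part in enumerate(pi, start=1) if part >= i)


def threshold_graph(pi):
    """Build (V, E) by joining vertex i to i+1, ..., pi_i + 1 for i = 1..trace(pi)."""
    vertices = set(range(1, len(pi) + 1))
    edges = set()
    for i in range(1, trace(pi) + 1):
        for j in range(i, pi[i - 1] + 1):
            edges.add(frozenset((i, j + 1)))
    return vertices, edges


def degree_sequence(vertices, edges):
    """Vertex degrees in nonincreasing order."""
    return sorted((sum(1 for e in edges if v in e) for v in vertices), reverse=True)


V, E = threshold_graph([3, 2, 2, 1])
print(sorted(tuple(sorted(e)) for e in E))
print(degree_sequence(V, E))
```

For the input [3, 2, 2, 1] the construction yields the edges {1,2}, {1,3}, {1,4}, {2,3}, whose degree sequence is the partition itself.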

13(a) Hint: One alternative is to show that det(xI_4 - L(G)) = x(x - 1)(x - 3) ×
(x - 4). Another is to find eigenvectors for L(G) corresponding to
eigenvalues λ = 1, 3, and 4.

15(a) The combination f(v) = d_G(v), v ∈ V(G), and r = 3 will work.

17(a) Hint: If f(v) = d_G(v), v ∈ V(G), is a threshold labeling, then 4 > f > 5.
(Why?) 

19 Hint: The easiest solution uses the characterization of threshold graphs from 
Exercise 18. Can you find a more revealing solution? 

21 Hint: If you have access to appropriate computer software, work out the 
Laplacian eigenvalues of the Petersen graph from Example 5.1.7. Otherwise, 
look for examples that have n < 6 vertices. 

23 Hint: To show that a graph is split, it suffices to exhibit an appropriate 
partitioning of its vertices. One way to do that is to color the vertices of the 
clique one color and the vertices of the independent set a different color, e.g., 
dark and light. 

25 Hint: The degree sequence of a connected graph on five vertices is a partition 
with five parts, the largest of which is at most 4. 
CHAPTER 6

6.1. Linear Codes

1(a) wt(110100010) = 4. 1(b) wt(001011101) = 5.

3(a) S, itself, is a basis. 3(c) Hint: dim(ℒ(S)) = 3.

5 Hint: ℒ(S) is a (4, 2^3, 2) code.

7 Hint: Consider S = {1100, 0011, 1111}.

9 Hint: {u ∈ F^n : wt(u) = 1} ⊆ S.

11 Hint: If w = x_1x_2x_3x_4, then w · 1100 = 0 if and only if x_1 + x_2 = 0, if and
only if x_1 = x_2.

13(a) Hint: Lemma 6.1.18.

15(b) One solution is {110000, 101010, 100001}. (Your solution should consist of
three vectors that span the same space.)

17(b) Hint: Show that C_2(n,k) = C_2(n - 1, k - 1) + 2^k C_2(n - 1, k) by
distinguishing two cases according to whether the (k,n)-entry is a pivot entry.

19 Hint: If u · w = 0 and v · w = 0, then (au + bv) · w = a(u · w) +
b(v · w) = 0 for all a, b ∈ F. Conversely, if (au + bv) · w = 0 for all
a, b ∈ F, then (au + bv) · w = 0 when a = 1 and b = 0.

6.2. Decoding Algorithms 

3(a) ℋ_2 is a (3,2,3) code. 3(b) ℋ_2 = {000, 111}.
3(c) G = (1 1 1).

5(a) Hint: Suppose 1 ≤ k ≤ m. Let S be the set of integers j, between 1 and
2^m - 1 inclusive, such that the kth digit in the binary expansion of j is 1. Use
the fundamental counting principle to show that o(S) = 2^{m-1}.
11(a) Hint: s = 00. 11(b) c = 01011. 11(c) c = 01110.
13(a) c = 0001111. 13(b) c = 0001111.

13(c) c = 0111100.

                   /0 1 1 1\
15(b) Hint: X 1 =  |1 0 1 1|
                   \1 1 0 1/

17(a) c = 1011010. 17(b) c = 0010110.

17(c) c = 0010110. 17(d) v is a codeword.

19 Hint: Exhibit an invertible linear transformation from ℋ_3 onto F^4.

21(a) d = 2.

21(b) Hint: There are eight such words.

23 Hint: Suppose s_1 and s_2 are two syndromes. Let X_i = {v ∈ F^n : s_i is the
syndrome of v}, i = 1, 2. Show that o(X_1) = o(X_2).

25(c) Hint: If all else fails, try induction. 

27 # is an (8, 16, 4) code. 

29(b) Hint: Why is it enough to show that C(23, 0) + C(23, 1) + C(23, 2) +
C(23, 3) = 2^11?
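
The sum in this hint is the number of binary words within distance 3 of a fixed word of length 23. The Python sketch below evaluates it and tests the sphere-packing equality M × (ball size) = 2^n; the helper names are illustrative, and the figure of 2^12 codewords used in the example is an assumption about the code in question, not something the hint states.

```python
from math import comb


def ball_size(n, radius):
    """Number of binary words of length n within Hamming distance `radius` of a given word."""
    return sum(comb(n, i) for i in range(radius + 1))


def is_perfect(n, codewords, radius):
    """True if `codewords` disjoint balls of this radius fill all of F^n exactly."""
    return codewords * ball_size(n, radius) == 2 ** n


print(ball_size(23, 3), 2 ** 11)
print(is_perfect(23, 2 ** 12, 3))   # assumes a code with 2^12 words of length 23
print(is_perfect(7, 16, 1))         # a (7, 16, 3) code correcting single errors
```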



6.3. Latin Squares 

1 Hint: Explain why 1 + 2 + ··· + n^2 = nm.

3 Hint: Begin by setting x = 0, y = 1, and z = 2.
5 Use A and B from Fig. 6.3.6 together with

    /0 1 2 3\
C = |3 2 1 0|
    |1 0 3 2|
    \2 3 0 1/

7 Hint: Suppose p, g, h ∈ G satisfy pg = ph.
9 Hint: Theorem 6.3.6 and Exercise 8.

11 Hint: If A, P, and Q are n × n matrices, then det(PAQ) = det(P) ×
det(A) det(Q).

13 Because m = 6 = 4(1) + 2 and 3 = 4(0) + 3 is a prime factor of the
square-free part of 6, it follows from the Bruck-Ryser theorem (and Theorem 6.3.16)
that there does not exist a family of five pairwise orthogonal Latin squares of
order 6.



15 Hint: A recipe can be found in the proof of Theorem 6.3.8. 
17(a) A®B-
17(b) A^fi:



19 Hint: Exercise 18 and Theorem 6.3. 1 

21(a) Hint: Exercise 6. 

          /0 1 2 3\
21(c) A = |1 2 3 0|
          |2 3 0 1|
          \3 0 1 2/



6.4. Balanced Incomplete Block Designs 

1 The die model comes close to being a BIBD. However, with respect to the
standard numbering of dice, faces 1 and j share two vertices, 2 ≤ j ≤ 5, but faces
1 and 6 share none.

3(a) Hint: The dependent parameter, r, is an integer.

3(b) Hint: The dependent parameter, b, is an integer.

5(b) The triple of parameters for S^c is (v, v - k, b + λ - 2r).

5(d) The complement of the design afforded by the finite projective plane of order 
2 is a (7,4,2)-design. 

5(e) Hint: Figure 6.4.2. 

7(b) Hint: Mimic the proof of Theorem 6.4.12 using part (a). 

7(c) Hint: Part (b) and the assumption that v > k. 

7(d) Hint: Part (c). 

7(e) Hint: Part (d). 

9(a) Hint: Example 6.4.8. 

11(a) Show that AA' = 3I_n + 3J_n.

11(b) Hint: Theorem 6.4.19. 
15 Hint: Show that det(HH') = n^n.
17 Hint: Theorem 6.4.19. 
19(a) Hint: Exercise 18. 
19(c) Hint: Exercise 5. 
21(b) Hint: Exercise 18. 

21(c) Hint: Of the n 2 entries in a Hadamard matrix of order n, how many are 
equal to 0? 

23 Hint: Corollary 6.4.7 and Theorem 6.4.19. 
27 Hint: A = (1/2)(K + J_{n-1}).



Appendix A2 Sorting Algorithms 

1(a) 1. Input N and set S = 0.

     2. For I = 1 to N.

     3. S = S + I.

     4. Next I.

     5. Write S.

1(b) 1. Input N.

     2. Write N × (N - 1)/2.

3(a) Hint: Since "Big Oh" involves an upper bound, it suffices to consider a 
"worst-case" scenario. 



5    1. L = M = 10.
     2. For I = 1 to M.
     3. "Read" (from data steps) X(I).
     4. Next I.
     5. For I = 1 to L.
     6. Read Y(I).
     7. Next I.
     8. J = K = T = 1.
     9. If X(J) > Y(K) then go to step 13.
    10. A(T) = X(J).
    11. J = J + 1 and T = T + 1.
    12. Go to step 15.
    13. A(T) = Y(K).
    14. K = K + 1 and T = T + 1.
    15. If J > M then go to step 18.
    16. If K > L then go to step 22.
    17. Go to step 9.
    18. For I = K to L.
    19. A(M + I) = Y(I).
    20. Next I.
    21. Go to step 25.
    22. For I = J to M.
    23. A(L + I) = X(I).
    24. Next I.
    25. For I = 1 to M + L.
    26. Write A(I).
    27. Next I.
    28. DATA 1,2,3,4,4,4,5,5,5,7
    29. DATA 2,3,3,4,5,5,6,6,8,9
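
For comparison, the same merge of two sorted lists can be written compactly in Python. This is an illustrative rendering, not the appendix's own code; the function name `merge` and the use of list slices for the leftover tail are choices made here.

```python
def merge(xs, ys):
    """Merge two lists that are already sorted into one sorted list."""
    merged = []
    j = k = 0
    while j < len(xs) and k < len(ys):
        if xs[j] > ys[k]:
            merged.append(ys[k])    # take from ys when its head is strictly smaller
            k += 1
        else:
            merged.append(xs[j])    # otherwise take from xs (ties favor xs)
            j += 1
    merged.extend(xs[j:])           # at most one of these two tails is nonempty
    merged.extend(ys[k:])
    return merged


X = [1, 2, 3, 4, 4, 4, 5, 5, 5, 7]
Y = [2, 3, 3, 4, 5, 5, 6, 6, 8, 9]
print(merge(X, Y))
```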

9(a)   1. Input N.
       2. For I = 1 to N.
       3. R(I) = ⌊1000 × RND⌋.
     3.1. Write R(I).
       4. Next I.
     4.1. Start = Time.
       5. T = 0.
       6. For J = 1 to N - 1.
       7. If R(J) ≤ R(J + 1) then go to step 12.
       8. T = 1.
       9. X = R(J).
      10. R(J) = R(J + 1).
      11. R(J + 1) = X.
      12. Next J.
      13. If T = 1 then go to step 5.
      14. For I = 1 to N.
      15. Write R(I).
      16. Next I.
      17. Write Time - Start.
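
A Python rendering of the same bubble sort, with the random input and timing wrapped in a small helper, might look like this. It is a sketch under stated assumptions: `bubble_sort` and `timed_demo` are hypothetical names, and `random.randrange(1000)` stands in for the integer part of 1000 × RND.

```python
import random
import time


def bubble_sort(values):
    """Repeat passes of adjacent swaps until a full pass makes no swap."""
    r = list(values)
    swapped = True
    while swapped:
        swapped = False
        for j in range(len(r) - 1):
            if r[j] > r[j + 1]:                 # equal neighbors are left alone
                r[j], r[j + 1] = r[j + 1], r[j]
                swapped = True
    return r


def timed_demo(n, seed=0):
    """Sort n pseudo-random integers below 1000 and report the elapsed time."""
    rng = random.Random(seed)
    data = [rng.randrange(1000) for _ in range(n)]
    start = time.perf_counter()
    result = bubble_sort(data)
    return result, time.perf_counter() - start


result, elapsed = timed_demo(200)
print(result[:10], elapsed)
```

Leaving equal neighbors in place matters: if ties were swapped, a list with repeated values would set the swap flag on every pass and the loop would never end.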
⌊x⌋                    greatest integer ≤ x                          38
⌈x⌉                    least integer ≥ x                             344
[?]                    about equal                                   65
A†                     classical adjoint (adjugate)                  498
AB                     line incident with A and B                    453
A\B                    complement of B in A                          42
A^c                    complement of A (in E)                        25
a(G) = λ_{n-1}(G)      algebraic connectivity                        402
A(G)                   adjacency matrix                              387
A_m                    alternating group of degree m                 194
[?]                    matrix of mystery coefficients                49
{a_n}                  sequence a_0, a_1, a_2, ...                   254
A'                     transpose of matrix A                         397
[?]                    mystery coefficient                           47ff
B_n                    nth Bell number                               132
χ(G)                   chromatic number                              360
[?]                    dual code                                     426
c(p)                   number of cycles in p                         222
c_i(p)                 no. cycles of length i in p                   233
C_n                    Pascal matrix                                 49
C_n                    cycle graph                                   383
C_{m,n}                "colorful" clone of F_{m,n}                   219
C(n,r)                 n-choose-r                                    10
C(u,n)                 extended binomial coefficient                 285
C_p(x)                 cycle of p containing x                       155
δ_{i,j}                Kronecker-delta                               498
Δa_n                   a_{n+1} - a_n                                 225
Δ^{k+1}a_n             Δ^k a_{n+1} - Δ^k a_n                         256
Δf(n)                  f(n+1) - f(n)                                 255
[?]                    BIBD                                          463
det(A)                 determinant of matrix A                       227, 497
d(G)                   degree sequence                               341
D(G)                   diagonal matrix of vertex degrees             398
d(n)                   no. divisors of n                             312
D(n)                   derangement number                            141
D_n                    dihedral group                                207
d(b,w)                 distance from b to w                          34
d(u,w)                 distance from u to w                          364
d(v) = d_G(v)          degree of vertex v in G                       341
E(G)                   edge set of graph G                           338
E_r(x_1, x_2, ..., x_n)  elementary symmetric function               88
e_m                    identity of S_m                               178
e(n,t)                 E_t(1, 2, ..., n)                             90
F                      {0,1}                                         34
F^n                    set of binary words of length n               34
F(π)                   Ferrers diagram of partition π                79
f(π)                   trace of partition π                          409
f(D)                   image(f) = {f(x) : x ∈ D}                     175
F(p)                   number of fixed points of p                   197
fH                     {fh : h ∈ H}                                  190
F_{m,n}                set of all functions from {1,2,...,m} to {1,2,...,n}   118
f^{-1}(y)              {x : f(x) = y}                                120
f^{-1}                 [?]                                           177
γ_i(G)                 adjacency (graph) eigenvalue                  392
G = (V,E)              Graph G                                       338
G^c                    complement of graph G                         343
G^d                    dual pseudograph                              378
G_d                    graph obtained from G^d                       379
G_x                    {p ∈ G : p(x) = x}                            189
G_1 + G_2              graph union                                   360
G_1 ∨ G_2              graph join                                    361
G - e                  edge deleted subgraph                         357
G/e                    graph obtained from G - e                     358
g ∘ f = gf             composition of g and f                        176
[?]                    nondecreasing functions in F_{m,n}            125, 245
g(n,m)                 number of graphs                              351
G - u                  vertex deleted subgraph                       385
G[W]                   subgraph induced on W                         348
x ≡ y (mod G)          equivalence modulo G                          195
ℋ_m Hamming code 433
H_m parity check matrix for ℋ_m 432
H_n(x_1, x_2, ..., x_k) homogeneous symmetric function 245
fH {fh : h ∈ H} 190
image(f) f(D) where D = domain(f) 175
I_n identity matrix 498
J_v v × v matrix of 1's 466
K_n complete graph 343
K_{s,t} complete bipartite graph 361
ker(A) kernel of matrix A 496
λ_i(G) Laplacian (graph) eigenvalue 401
ℓ(π) length of partition π 77
L(G) Laplacian matrix 398
ℒ(S) linear span of set S 424, 496
μ(G) matching number 383
Möbius function 313
M(G, x) matching polynomial 384
d | n d divides n 312
M_r(x_1, x_2, ..., x_n) rth power sum 95
(n, M, d) binary code parameters 35
N(n, r) C(n, 0) + C(n, 1) + ··· + C(n, r) 36
N(s, t) Ramsey number 349
$\binom{n}{r_1, r_2, \ldots, r_k}$ multinomial coefficient 5
$\binom{n}{r}$ = C(n, r)
ω(G) clique number 367
o(E) cardinality of set E 24
o(p) order of permutation p 186
O_x {p(x) : p ∈ G} 195
O(g(n)) Big Oh 489
φ golden ratio 282
φ(n) Euler totient function 147
π ⊢ n π is a partition of n 11
π* partition conjugate to π 19
p induced action p 212
p induced action p 219
⟨p⟩ cyclic subgroup generated by p 188
P(A) probability of A 24
P(B | A) probability of B given A 26
per(A) permanent of matrix A 227
p(G, x) chromatic polynomial 357
p_m(n) number of m-part partitions of n 78
p(n) number of partitions of n 78
P_n path graph 383
P_n n × n power matrix (i^j) 49
p^{-r} (p^{-1})^r = (p^r)^{-1} 193
P(n, r) r! C(n, r) 57
no. distinct partitions of n 291
no. odd-part partitions of n 292
Q(G) oriented v × e incidence matrix 396
Q^t transpose of matrix Q 397
q(G, r) number of r-matchings 384
Q_{m,n} increasing functions in F_{m,n} 119
ℝ real numbers 265
S^⊥ orthogonal complement of S 426
s(G) Laplacian spectrum 401
s(m, n) Stirling number of the 1st kind 159
S(m, n) Stirling number of the 2nd kind 122
S_n^(2) pair group 246
S_n permutations in F_{n,n} 141
S_V permutations of V 181
S_r(w) sphere of radius r 36
t(G) spanning tree number 400
tr(A) trace of matrix A 100
V^(2) 2-element subsets of V 246, 338
V(G) vertex set of graph G 338
v^t transpose of v
ω(G) clique number 367
W^⊥ orthogonal complement of W 497
wt(u) weight of binary word u 423
w(f) weight of coloring f 230
w(f), f ∈ P 231
W_G(x_1, x_2, ..., x_n) pattern inventory 231
χ(G) chromatic number 360
x^(n) falling factorial function 90
x ≡ y (mod G) equivalence modulo G 195
n × 1 matrix of 1's 399
ζ(s) Riemann zeta function 311
cycle index polynomial 242
cycle index polynomial for S_m 243
Abbott, E. A. 372, 501
abundant number 84 
acyclic 

graph 389 

orientation 396 

polynomial 384 
adjacency matrix 387, 403 
adjacent 

edges 338 

vertices 338 
adjoint see classical adjoint 
adjugate 398, 498 
algebra of formal power series 272 
algebraic connectivity 402, 405 
al-Khowarizmi, Mohammed ben Musa 100 
alternant hydrocarbon 361 
alternating group see group, alternating 
alternating sign theorem 49ff, 259 
al-Tusi 12 
Andrews, G. E. 501 
antiregular graph 417 
Anton, H. 501 
Apianus, Petrus 12 
Appel, Kenneth 378 
arithmetic sequence 53, 254 
ASCII 39-40 

Association for Women in Mathematics 8 
astragali 65 

Balakrishnan, V. K. 501 

balanced incomplete block design 463 

barcode 5ff 

Barnard, Fred R. 152 

basis of a

linear code 424 

vector space 496 



Bayes, Thomas 27 
Bayes's First Rule 26 
Beineke, L. W. 502 
Bell, E. T. 132 

Bell numbers 132ff, 172, 201ff, 301ff, 310, 321
benzene 240

Berkeley, (Bishop) George 27 
Bernoulli, Jakob 54 
Bernoulli numbers 54ff, 76, 318

Berra, Yogi 279 

BIBD see balanced incomplete block design
Big Oh 489 
Biggs, N. L. 501 
binary 

code 34, 419, 462ff 

operation 180

word 34, 112-113 
binomial 

coefficient 43 

probability distribution 29 

theorem 66 
bipartite graph 361, 393 
bipartition 361 
Birkhoff, G. 360 
birthday paradox 505 
bit 5, 34 
Blake, William 87 
of a design 463

of a graph 393 

of a partition 121 
Boolean

arithmetic 422, 495 

linear combination 424 

vector space 423 
Bose, R. C. 139, 451
Bose-Einstein model 139 
boundary conditions 321 
Bressoud, D. M. 501 
Brooks, R. L. 368 
Brualdi, R. A. 501 
Bruck, R. H. 458, 466 
Bruck-Ryser Theorem 458-9, 537 
Bruck-Ryser-Chowla Theorem 466, 475 
Bryan, William Jennings 394 
Budapest 19 

Burnside, William 197, 501 
Burnside's Lemma 197, 230 
Busby, R. C. 501 
by the numbers 340 
byte 39 

Caesar cypher 126 
Cameron, P. J. 202 
cardinality 10, 21 
Cartesian product 394 
Catalan, Eugene 17 

Catalan numbers (sequence) 17-18, 296, 334

Cauchy, Augustin-Louis 197 
Cauchy-Binet Determinant Theorem 400, 500

Cauchy's identity 251 

Cayley, Sir Arthur 180, 377

Cayley table 180ff, 459 

characteristic polynomial of a 

homogeneous linear recurrence 323 
matrix 335, 388, 401, 407, 498 

characteristic roots 498ff 

Chebyshev, Pafnuti 387

Chebyshev polynomials 387 

check digit (bit) see parity check digit 

chi-squared 30 

Chowla, S. 466 

chromatic 

number 360, 391, 417 
polynomial 360, 384, 393, 403 
reduction 358, 385 

Chu Shih-Chieh 12, 45, 54 

Chuck-a-Luck 22 

Chu's Theorem 45ff, 126 

classical adjoint 398, 498 

clique 348 

clique number 367, 417 



closed formula 254, 271, 279 
closure property 181 
coalesced vertices 358 
coalescence 368 
codebook 438 
Cohen, D. I. A. 501 
Colbourn, C. J. 501 
color pattern see pattern 
coloring 218 
companion matrix 335 
complement 

of a BIBD 472 

of a binary code 40 

of a binary word 40 

of a graph 343 
complete 

bipartite graph 361 

family of mutually orthogonal 
Latin squares 452 

graph 343 
component of a graph 342 
composition of 

functions 176 

permutations 178 

positive integers 60, 282 
conditional probability 26 
conjugate of a partition 79 
connected graph 342, 373, 392, 402, 405 
constant weight code 40, 462ff 
Constantine, G. M. 501 
convex sequence 265 
Cook, S. A. 342 
coordinate representation 499 
Corneille, Pierre 117 
coset of a 

permutation group 190 

vector space 497 
covered vertex 383 
covering 

number 391 

of a graph 391 
crossing edges see edge crossings 
cuboctahedron 217 
cut-vertex 393 
Cvetkovic, D. M. 501 
cycle 

directed 395-6 

graph 383 

in a graph 362 

in a permutation 155 

index polynomial 242ff, 300 

nontrivial 185 

permutation 185 






structure 157, 213 
type 157 
cyclic group 188 

de Mere, Chevalier 31 
de Morgan, Augustus 377 
de Parville, H. 333 
decode 34 

decomposition see composition 
deficient number 315 
degree 

of a permutation 184, 186 

of a permutation group 181

of a vertex 338, 341 

sequence 341 
Delacroix, Eugene 100 
Democritus 337 
dependent parameters 464 
derangement 141 

derangement number 141ff, 203, 228, 243, 307, 317
Descartes, Rene 182 
determinant 227, 497, 499 
Dewar, James 347 
diameter of a graph 364 
dictionary order 105, 109, 119, 480 
difference 

array 255 

sequence 255 
dihedral group 207, 242 
dimension 424, 496 
Dinitz, J. H. 501 
directed 

arc 395 

cycle 395-6 

graph 395 

path 395 
Dirichlet generating function 310ff
Dirichlet, Peter Gustav Lejeune 310 
disconnected graph 346 
discrete derivative 255, 265, 317 
disjoint cycle factorization (notation) 154ff 
distance between 

binary words 34 

vertices in a graph 364, 394 
distinct partitions 291-2, 415 
Dobinski, G. 203 
Dobinski's formula for the Bell numbers 203, 283
dodecahedron 216 
domain 118 
Doob, M. 501 
dot product see scalar product 



double precision 493 
doubly transitive 199 
dual of a 

BIBD 472 

binary code 426, 497 

projective plane 453 

pseudograph 379 
duality principle 453 

Edgar, Hugh 205 
edge 

chromatic number 356 

connectivity 392 

crossings 340, 372 

of a graph 338 

of a polyhedron 210ff

subgraph 357 
Efron, Bradley 32 
eigenvalue 392, 401ff, 498
eigenvector 498 
Einstein, A. 139 
number 90ff, 129

row operations 495 

symmetric function 88f, 112, 120, 128, 
166, 251, 298ff, 318, 401, 477, 498 
empirical probability 114
class 133

relation 133 
equivalent 

codes 42, 442 

colorings 218 

cycles 154ff 

Latin squares 459 

modulo G 195, 223, 231 
Erdős, P. 344
Erdős's theorem 356
error 

correcting code 34, 464ff 
pattern 437 
Euclid 101 

Euclidean algorithm 101 

Euler, Leonhard 17, 151, 217, 449, 451 
totient function 147ff, 318
Euler's 

formula 217, 373 

magic square 449 

pentagonal number theorem 293 

theorem 151 
extended binomial coefficient 285, 295
265, 357, 367, 369
Fermat's little theorem 74, 140, 151
Ferrers, Norman Macleod 79
Fibonacci 19, 320
152, 264, 281, 295, 320, 331, 333, 394
Fiedler, Miroslav 402, 405 
finite projective plane 454 
first theorem of graph theory 341, 408 
five-color theorem 376 
fixed point 141 
formal

derivative 275 

power series 271 
four-color theorem 377 
Franklin, Benjamin 447 
Franklin's magic square 449 
freeze-dried expression 269ff 
Frobenius, Georg 197 
Frost, Robert 66, 76 
Fuller, R. Buckminster 194, 216, 379 
fullerene 216, 379 
fundamental counting principle 2ff 
of arithmetic 6, 154
480

Galilei, Galileo 87, 267 

Garey, M. 342, 502 

Gauss, Carl Friedrich 45-6, 285 

Gauss-Jordan elimination 495 

general solution 323 
generating

function 165, 244, 268, 351, 415 

matrix 429 

set of codewords 424 
genus of a graph 382
dual 378

sequence 268 
Golay code 437, 447, 473 
golden ratio 282, 295 
Golomb, S. W. 127 
Gore, Al 100 

Graham, Ron 202, 349, 501-2 
graph 338 

eigenvalues 392, 401 
join 361

union 360 
graphic partition 408ff 
Grone, R. D. 403
Grotschel, M. 202
abstract 189, 232

alternating 183-4, 194, 202 

permutation 181 
Gutenberg, Johann 421 
design 470

matrix 468ff 
Haken, Wolfgang 378 
Hall, Marshall 502 
cycle 370-1

graph 370 
Hamming code 39, 433ff, 444ff 
Harary, Frank 234, 502 
Hardy, G. H. 1, 218, 311, 502
Hasselbarth, W. 410
head of an oriented edge 395
Heawood, Percy 378 
Heilmann, O. J. 384 
normal form 428, 433, 495

polynomials 387 
Hoare, C. A. R. 485 
Hoffman, D. G. 502 
Holden, A. 502 
homeomorphic graphs 375
homogeneous 

linear equations 497 

linear recurrence 273, 302, 321ff, 326 

polynomial 71 

symmetric function 85-6, 98, 125ff, 
237, 240, 245ff, 299, 300, 318 
Hosoya, H. 384 
Hosoya topological index 394 

identity 

matrix 498 

permutation 178 
image 118, 175 
incidence matrix of a 

BIBD 464 

graph 419 

finite projective plane 461 
independence number 391
independent 

edges 383 

outcomes/events 27 

vertices 348, 361, 383, 417 
induced 

action 205, 212, 246 

clique 364 

subgraph 348 
initial conditions 320 
insertion sort 488-9 
integrating factor 303 
interval graph 418 
invariant of a graph 340 
function 177
inversion number 87, 151-2, 301
isomorphic

games 448 

graphs 339, 413 
isomorphism problem 340

James, William 461 
Jefferson, Thomas 383 
Johnson, David 342, 502 
kernel 428-9, 496
matrix 398, 403

spectrum 401ff, 406ff, 416
walk 394
function 499